ConvNeXt vs EfficientNet for Deepfake Detection

A two-stage transfer-learning comparison on 190,335 facial images. ConvNeXt-Tiny reached 92.94% test accuracy, with Grad-CAM explainability, and generalized better than EfficientNetV2-S (86.59%) on the 10,905-image test set despite a slightly lower peak validation accuracy (98.68% vs. 98.76%).

Group members: Nabonita Das, Ali Hamza, Akash Patel · CIS 579: Artificial Intelligence · University of Michigan–Dearborn

Model	Params	Accuracy	Precision	Recall	F1 Score	Best val accuracy
ConvNeXt-Tiny (Winner)	27.8M	92.94%	88.69%	98.54%	93.36%	98.68%
EfficientNetV2-S	20.2M	86.59%	79.81%	98.23%	88.07%	98.76%
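
One way to read the generalization claim is the drop from best validation accuracy to test accuracy. The sketch below uses only the percentages reported above; the dictionary layout and the helper name `generalization_gap` are illustrative assumptions, not part of the project code.

```python
# Reported figures from the model cards (percent).
reported = {
    "ConvNeXt-Tiny": {"best_val_acc": 98.68, "test_acc": 92.94},
    "EfficientNetV2-S": {"best_val_acc": 98.76, "test_acc": 86.59},
}


def generalization_gap(best_val_acc: float, test_acc: float) -> float:
    """Validation-to-test drop in percentage points (hypothetical helper)."""
    return round(best_val_acc - test_acc, 2)


gaps = {name: generalization_gap(**m) for name, m in reported.items()}
for name, gap in gaps.items():
    print(f"{name}: {gap:.2f} points")
```

On these figures the drop is 5.74 points for ConvNeXt-Tiny and 12.17 points for EfficientNetV2-S, so the model with the marginally higher validation peak loses roughly twice as much on the test set.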

Test-Set Metric Comparison

ConvNeXt-Tiny — Confusion Matrix

Actual ↓ / Predicted →	Real	Fake
Real	5,412 (49.6%) True Real (TN)	80 (0.7%) False Fake (FP)
Fake	690 (6.3%) False Real (FN)	4,723 (43.3%) True Fake (TP)

EfficientNetV2-S — Confusion Matrix

Actual ↓ / Predicted →	Real	Fake
Real	5,395 (49.5%) True Real (TN)	97 (0.9%) False Fake (FP)
Fake	1,365 (12.5%) False Real (FN)	4,048 (37.1%) True Fake (TP)
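
The percentage in each confusion-matrix cell is that cell's share of the whole 10,905-image test set. A minimal sketch that recomputes them, assuming the top-right counts (80 and 97) are the values listed in the FP column of the summary table; the names `cells` and `cell_shares` are illustrative only.

```python
TOTAL = 10_905  # test-set size stated in the summary

# Cell counts in reading order: top-left, top-right, bottom-left, bottom-right.
# The top-right values are taken from the summary table's FP column (assumption).
cells = {
    "ConvNeXt-Tiny": [5412, 80, 690, 4723],
    "EfficientNetV2-S": [5395, 97, 1365, 4048],
}


def cell_shares(counts, total=TOTAL):
    """Share of the whole test set in each cell, in percent to one decimal."""
    if sum(counts) != total:
        raise ValueError("cells should cover the whole test set")
    return [round(100 * c / total, 1) for c in counts]


shares = {name: cell_shares(c) for name, c in cells.items()}
for name, s in shares.items():
    print(name, s)
```

Both sets of four counts sum to 10,905, and the recomputed shares match the percentages shown in the matrices.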

Full Test-Set Summary

Model	Accuracy	Precision	Recall	F1	FP	FN
ConvNeXt-Tiny	92.94%	88.69%	98.54%	93.36%	80	690
EfficientNetV2-S	86.59%	79.81%	98.23%	88.07%	97	1,365

Test set: 5,413 Real + 5,492 Fake = 10,905 images. FP = real images classified as fake. FN = fake images classified as real.
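
The summary metrics can be rederived from the four confusion-matrix counts with the standard formulas. The sketch below is a check on the arithmetic, not the project's evaluation code. It rests on one assumption inferred from the numbers themselves: the reported precision and recall are reproduced when the class with 5,492 test images is the positive class, which makes 690 (or 1,365) the false-positive count and 80 (or 97) the false-negative count. This is the reverse of the FP and FN columns as labeled in the table. Accuracy is unaffected by that choice. The function name `binary_metrics` is hypothetical.

```python
def binary_metrics(tp: int, fp: int, fn: int, tn: int) -> dict:
    """Standard binary-classification metrics, returned as percentages."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    return {
        "accuracy": round(100 * accuracy, 2),
        "precision": round(100 * precision, 2),
        "recall": round(100 * recall, 2),
        "f1": round(100 * f1, 2),
    }


# ASSUMPTION: positive class = the class with 5,492 test images
# (e.g. 5,412 correct + 80 missed for ConvNeXt-Tiny). This assignment is
# inferred from the reported figures, not confirmed by the project.
convnext = binary_metrics(tp=5412, fp=690, fn=80, tn=4723)
effnet = binary_metrics(tp=5395, fp=1365, fn=97, tn=4048)

print("ConvNeXt-Tiny   ", convnext)
print("EfficientNetV2-S", effnet)
```

Under this assignment the function returns 92.94 / 88.69 / 98.54 / 93.36 for ConvNeXt-Tiny and 86.59 / 79.81 / 98.23 / 88.07 for EfficientNetV2-S, matching the table. The difference between the two models is almost entirely in the 690 versus 1,365 errors, which is why precision separates them while recall does not.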