Distilling Knowledge or Transferring Weights? An Experimental Perspective on Classifiers
Abstract
This study presents a systematic comparative analysis of knowledge distillation and transfer learning methodologies applied to image classification on the CIFAR-10 dataset. Using ResNet-18 architectures as the baseline, we investigate the trade-offs between model complexity, computational efficiency, and classification performance under various optimization strategies. The results demonstrate that knowledge distillation consistently outperforms transfer learning across all tested configurations. Most notably, a lightweight ResNet-18 student model (2.84M parameters) guided by a ResNet-18 teacher achieved 89.03% accuracy, significantly exceeding transfer learning's maximum of 86.36% despite using only 25% of the parameters. These findings have practical implications for model optimization: transferring knowledge through soft targets can overcome the conventional trade-off between model size and performance, making the approach well suited to resource-constrained deployments.
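For readers unfamiliar with the soft-target mechanism the abstract refers to, the sketch below shows a standard Hinton-style distillation objective in PyTorch. The temperature T and weighting factor alpha are illustrative assumptions, not the settings reported in the study.

```python
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, labels, T=4.0, alpha=0.7):
    """Soft-target distillation loss: weighted sum of a KL term on
    temperature-softened teacher/student distributions and ordinary
    cross-entropy on the ground-truth labels. T and alpha are
    hypothetical example values."""
    # Soft targets: KL divergence between softened distributions,
    # scaled by T^2 to keep gradient magnitudes comparable across temperatures.
    soft = F.kl_div(
        F.log_softmax(student_logits / T, dim=1),
        F.softmax(teacher_logits / T, dim=1),
        reduction="batchmean",
    ) * (T * T)
    # Hard targets: standard cross-entropy against the true labels.
    hard = F.cross_entropy(student_logits, labels)
    return alpha * soft + (1.0 - alpha) * hard
```

In a training loop, the teacher's logits would be computed under torch.no_grad() and only the student's parameters updated with this loss.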