Performance Evaluation of Low-Precision Quantized LeNet and ConvNet Neural Networks
Abstract
Low-precision neural network models are crucial for reducing memory footprint and computational cost. However, existing methods typically rely on 32-bit floating-point (FP32) arithmetic to maintain accuracy. Floating-point numbers impose heavy memory requirements in convolutional and deep neural network models, and large bit-widths place a heavy computational load on hardware architectures. Moreover, solving today's problems requires ever-deeper network models with millions or billions of parameters. This large number of parameters increases computational complexity and causes memory allocation problems, so existing hardware accelerators become insufficient. In applications where accuracy can be traded off against hardware complexity, model quantization enables neural networks to be implemented with limited hardware resources. From a hardware design point of view, quantized models offer advantages in speed, memory, and power consumption over FP32 models. In this study, we compare the training and test accuracy of quantized versions of the LeNet model and our own ConvNet model at different epoch counts, quantizing the models to low-precision int-4, int-8, and int-16. Our tests show that the LeNet model reached only 63.59% test accuracy after 400 epochs with int-16, whereas the ConvNet model achieved 76.78% test accuracy after only 40 epochs with low-precision int-8 quantization.
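
The abstract does not specify the quantization scheme used; as a minimal sketch, assuming symmetric uniform per-tensor quantization, the following Python/NumPy snippet illustrates how FP32 weights can be mapped to int-8 (int-4 or int-16 follow by changing num_bits). The function names quantize and dequantize are illustrative, not taken from the paper.

import numpy as np

def quantize(x, num_bits):
    # Symmetric uniform quantization: map FP32 values to signed integers
    # in [-(2^(b-1) - 1), 2^(b-1) - 1] using a single per-tensor scale.
    qmax = 2 ** (num_bits - 1) - 1
    scale = max(np.max(np.abs(x)) / qmax, 1e-12)  # guard against all-zero input
    q = np.clip(np.round(x / scale), -qmax, qmax).astype(np.int32)
    return q, scale

def dequantize(q, scale):
    # Recover an FP32 approximation of the original tensor.
    return q.astype(np.float32) * scale

# Example: quantize a random weight tensor to int-8 and measure the error.
w = np.random.randn(4, 4).astype(np.float32)
q, s = quantize(w, num_bits=8)
w_hat = dequantize(q, s)
print("max abs quantization error:", np.max(np.abs(w - w_hat)))

Lower bit-widths shrink the integer range (e.g. int-4 spans only -7 to 7), which coarsens the quantization grid and increases the reconstruction error, matching the accuracy trade-off the study reports.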










