Paper
27 September 2024
A nonlinear convolution neural network quantization method
Lou Zheng, Fan Lei, Tao Weisong, Xu Chao
Proceedings Volume 13281, International Conference on Cloud Computing, Performance Computing, and Deep Learning (CCPCDL 2024); 132810W (2024) https://doi.org/10.1117/12.3051024
Event: International Conference on Cloud Computing, Performance Computing, and Deep Learning, 2024, Zhengzhou, China
Abstract
Convolutional neural networks achieve remarkable recognition rates in computer vision, but the models are often too large to deploy on mobile devices, which makes model compression particularly important. Model quantization is essential in scenarios with limited hardware resources, such as the National Grid. This paper introduces an INT8 quantization method for convolutional neural networks that transforms the weights and the inputs and outputs of the network into the logarithmic domain through a nonlinear formula. The method represents the distribution of small values more faithfully, which results in less accuracy loss. Model calibration requires only a limited amount of image data, which shortens calibration time while keeping accuracy loss minimal. Experimental results show that the proposed quantization method incurs an accuracy loss of less than 1% on ImageNet and CIFAR.
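The abstract does not state the nonlinear formula itself, so the following is only a minimal sketch of generic logarithmic-domain quantization to signed 8-bit codes, not the authors' method. The function names, the `steps_per_octave` parameter, and the max-absolute-value calibration rule are all assumptions made for illustration. A plain linear INT8 quantizer is included to show why a logarithmic grid can preserve small values better.

```python
import math

def calibrate_max_abs(samples):
    """Hypothetical calibration: take the largest magnitude seen in a small sample."""
    return max(abs(v) for v in samples)

def log_quantize(x, max_abs, bits=8, steps_per_octave=8):
    """Map a float to a signed code; code 0 is reserved for exactly zero.

    Assumed scheme: magnitudes sit on a geometric grid
    max_abs * 2**(-e / steps_per_octave), with e in [0, levels - 1].
    """
    levels = 2 ** (bits - 1) - 1  # 127 magnitude codes for 8 bits
    if x == 0.0 or max_abs <= 0.0:
        return 0
    # Distance below the calibrated maximum, in grid steps (log domain).
    e = round(-math.log2(abs(x) / max_abs) * steps_per_octave)
    e = min(max(e, 0), levels - 1)  # saturate at both ends of the grid
    code = levels - e               # larger magnitude -> larger code
    return code if x > 0 else -code

def log_dequantize(code, max_abs, bits=8, steps_per_octave=8):
    """Invert log_quantize back to an approximate float."""
    levels = 2 ** (bits - 1) - 1
    if code == 0:
        return 0.0
    e = levels - abs(code)
    magnitude = max_abs * 2.0 ** (-e / steps_per_octave)
    return math.copysign(magnitude, code)

def linear_quantize(x, max_abs, bits=8):
    """Symmetric linear INT8 quantizer, for comparison only."""
    levels = 2 ** (bits - 1) - 1
    q = round(x / max_abs * levels)
    return min(max(q, -levels), levels)

def linear_dequantize(q, max_abs, bits=8):
    levels = 2 ** (bits - 1) - 1
    return q * max_abs / levels

if __name__ == "__main__":
    weights = [1.0, -0.3, 0.01, 0.001, -0.0005]  # made-up values
    scale = calibrate_max_abs(weights)
    for w in weights:
        lin = linear_dequantize(linear_quantize(w, scale), scale)
        log = log_dequantize(log_quantize(w, scale), scale)
        print(f"{w:+.4f}  linear={lin:+.6f}  log={log:+.6f}")
```

With these assumed settings, a weight of 0.001 at a scale of 1.0 rounds to zero under the linear quantizer, while the logarithmic grid keeps every in-range value within roughly 5% relative error. The price is coarser absolute resolution near the maximum, which is the usual trade-off of a logarithmic grid.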
(2024) Published by SPIE. Downloading of the abstract is permitted for personal use only.
Lou Zheng, Fan Lei, Tao Weisong, and Xu Chao "A nonlinear convolution neural network quantization method", Proc. SPIE 13281, International Conference on Cloud Computing, Performance Computing, and Deep Learning (CCPCDL 2024), 132810W (27 September 2024); https://doi.org/10.1117/12.3051024
KEYWORDS
Quantization
Convolution
Neural networks
Data modeling
Education and training
Calibration
Performance modeling