Impact of Mixed Precision Techniques on Training and Inference Efficiency of Deep Neural Networks

Dörrich M, Fan M, Kist A (2023)

Publication Type: Journal article

Publication year: 2023

Journal: IEEE Access

Publisher: Institute of Electrical and Electronics Engineers (IEEE)

Volume: 11

Page Range: 57627-57634

DOI: 10.1109/ACCESS.2023.3284388

Abstract

In the deep learning community, increasingly large models are being developed, leading to rapidly growing computational and energy costs. A recent trend advocates that researchers report energy efficiency alongside model performance in their papers. Previous research has shown that reduced precision can improve energy efficiency. Based on this finding, we propose a simple practice to improve the energy efficiency of training and inference: training the model with mixed precision and deploying it on Edge TPUs. We evaluated its effectiveness by comparing the speedup of a state-of-the-art semantic segmentation architecture across typical usage scenarios, including different devices, deep learning frameworks, model sizes, and batch sizes. Our results show that enabling mixed precision yields up to a 1.9× speedup on GPUs compared to the default float32 data type. Deploying the models on Edge TPUs further accelerated inference by a factor of 6. Our approach allows researchers to accelerate training and inference without jeopardizing the model's accuracy, while reducing energy consumption and electricity cost without changing the model architecture or retraining. Furthermore, our approach helps reduce the carbon footprint of training and deploying neural networks and thus has a positive effect on environmental resources.
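
As background on the term: mixed precision training generally runs most arithmetic in float16 while keeping sensitive values in float32, and frameworks commonly apply loss scaling so that very small gradients are not lost in float16. The sketch below is not taken from the paper. It uses only the Python standard library to show the numerical effect, and the gradient value, the scale factor, and all function names are hypothetical choices made for illustration.

```python
import struct


def to_float16(x: float) -> float:
    """Round x to the nearest IEEE 754 half-precision (float16) value."""
    return struct.unpack("<e", struct.pack("<e", x))[0]


def to_float32(x: float) -> float:
    """Round x to the nearest IEEE 754 single-precision (float32) value."""
    return struct.unpack("<f", struct.pack("<f", x))[0]


# Hypothetical values, chosen for illustration only.
TINY_GRADIENT = 1e-8   # smaller than float16's smallest subnormal (about 6e-8)
LOSS_SCALE = 1024.0    # a power of two, so the multiplication is exact


def naive_float16_gradient(grad: float) -> float:
    """Cast the gradient straight to float16; tiny values underflow to zero."""
    return to_float16(grad)


def scaled_float16_gradient(grad: float, scale: float = LOSS_SCALE) -> float:
    """Scale up before the float16 cast, then unscale and keep it in float32."""
    half = to_float16(grad * scale)   # large enough to survive in float16
    return to_float32(half / scale)   # float32 can hold the small magnitude


if __name__ == "__main__":
    print("naive :", naive_float16_gradient(TINY_GRADIENT))
    print("scaled:", scaled_float16_gradient(TINY_GRADIENT))
```

In this toy setting the unscaled cast loses the gradient entirely, while the scaled path recovers a value close to the original. Real frameworks handle the scaling automatically and usually adjust the factor during training.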

Authors with CRIS profile

Marion Dörrich, Juniorprofessur für Artificial Intelligence in Communication Disorders

Mingcheng Fan, Juniorprofessur für Artificial Intelligence in Communication Disorders

Andreas Kist, Juniorprofessur für Artificial Intelligence in Communication Disorders

How to cite

APA:

Dörrich, M., Fan, M., & Kist, A. (2023). Impact of Mixed Precision Techniques on Training and Inference Efficiency of Deep Neural Networks. IEEE Access, 11, 57627-57634. https://dx.doi.org/10.1109/ACCESS.2023.3284388

MLA:

Dörrich, Marion, Mingcheng Fan, and Andreas Kist. "Impact of Mixed Precision Techniques on Training and Inference Efficiency of Deep Neural Networks." IEEE Access 11 (2023): 57627-57634.
