Deutel M, Hannig F, Mutschler C, Teich J (2024)
On-Device Training of Fully Quantized Deep Neural Networks on Cortex-M Microcontrollers
Publication Language: English
Publication Type: Journal article, Original article
Publication year: 2024
URI: https://ieeexplore.ieee.org/document/10726519
DOI: 10.1109/TCAD.2024.3484354
Open Access Link: https://arxiv.org/abs/2407.10734
On-device training of DNNs allows models to adapt and fine-tune to newly collected data or changing domains while deployed on microcontroller units (MCUs). However, DNN training is a resource-intensive task, making the implementation and execution of DNN training algorithms on MCUs challenging due to low processor speeds, limited throughput, limited floating-point support, and tight memory constraints. In this work, we explore on-device training of DNNs on differently sized Cortex-M MCUs (Cortex-M0+, Cortex-M4, and Cortex-M7). We present a method that enables efficient training of DNNs entirely in place on the MCU using fully quantized training (FQT) and dynamic partial gradient updates. We demonstrate the feasibility of our approach on multiple vision and time-series datasets and provide insights into the tradeoff between training accuracy, memory overhead, energy, and latency on real hardware. The results show that, compared to related work, our approach requires 34.8% less memory and has a 49.0% lower latency per training sample, with dynamic partial gradient updates allowing a speedup of up to 8.7× compared to fully updating all weights.
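To make the two ideas named in the abstract concrete, the C sketch below shows a fully quantized SGD step for a single dense layer with a dynamic partial gradient update: int8 weights, int32 gradient accumulators, a power-of-two learning rate realized as a bit shift, and a per-row gradient-magnitude gate that skips low-impact rows. This is only a minimal illustration under assumed conventions, not the paper's implementation; all names, sizes, and the gating heuristic are hypothetical.

/*
 * Illustrative sketch (NOT the authors' implementation): a fully
 * quantized SGD step for one dense layer with a dynamic partial
 * gradient update. Weights are int8; gradients are accumulated in
 * int32; the learning rate is a power of two applied as a right
 * shift, so no floating point is needed. Layer sizes, the threshold,
 * and the per-row L1-norm gate are hypothetical choices.
 */
#include <stdint.h>
#include <stdio.h>

#define IN_DIM     8
#define OUT_DIM    4
#define GRAD_SHIFT 6   /* learning rate = 2^-6, applied as a shift */

static int8_t clamp_i8(int32_t v) {
    if (v > 127)  return 127;
    if (v < -128) return -128;
    return (int8_t)v;
}

/* One quantized SGD step: w -= (grad >> GRAD_SHIFT), gated per row. */
static void partial_update(int8_t w[OUT_DIM][IN_DIM],
                           const int32_t grad[OUT_DIM][IN_DIM],
                           int32_t threshold) {
    for (int o = 0; o < OUT_DIM; o++) {
        /* Dynamic gating: estimate row importance via L1 gradient norm. */
        int32_t row_norm = 0;
        for (int i = 0; i < IN_DIM; i++)
            row_norm += grad[o][i] < 0 ? -grad[o][i] : grad[o][i];
        if (row_norm < threshold)
            continue; /* skip this row: saves compute and memory writes */
        for (int i = 0; i < IN_DIM; i++) {
            int32_t step = grad[o][i] >> GRAD_SHIFT; /* assumes non-negative
                                                        or arithmetic shift */
            w[o][i] = clamp_i8((int32_t)w[o][i] - step);
        }
    }
}

int main(void) {
    int8_t  w[OUT_DIM][IN_DIM] = {{0}};
    int32_t grad[OUT_DIM][IN_DIM];
    /* Fake gradients: one "important" row, three near-zero rows. */
    for (int o = 0; o < OUT_DIM; o++)
        for (int i = 0; i < IN_DIM; i++)
            grad[o][i] = (o == 0) ? 640 : 1;
    partial_update(w, grad, /*threshold=*/256);
    printf("w[0][0]=%d w[1][0]=%d\n", w[0][0], w[1][0]); /* -10 and 0 */
    return 0;
}

In this toy run only row 0 clears the gate (640 >> 6 = 10, so its weights move to -10), while the three low-gradient rows are left untouched. Keeping both the arithmetic and the update path in integers like this avoids software float emulation, which is one plausible reason fully quantized training pays off on FPU-less targets such as the Cortex-M0+.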
APA:
Deutel, M., Hannig, F., Mutschler, C., & Teich, J. (2024). On-Device Training of Fully Quantized Deep Neural Networks on Cortex-M Microcontrollers. IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems. https://doi.org/10.1109/TCAD.2024.3484354
MLA:
Deutel, Mark, et al. "On-Device Training of Fully Quantized Deep Neural Networks on Cortex-M Microcontrollers." IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems (2024).