A SIMD MAC RISC-V Extension with Approximate Multipliers for Accelerating CNN Inference in Tiny Embedded Devices

Hernandez Morales JJ, Hannig F, Teich J (2025)


Publication Language: English

Publication Type: Conference contribution, Original article

Publication year: 2025

Publisher: Springer Cham

Series: Lecture Notes in Computer Science

Conference Proceedings Title: Architecture of Computing Systems, 38th International Conference, ARCS 2025, Kiel, Germany, April 22–24, 2025, Proceedings

Event location: Kiel DE

Abstract

Deploying deep neural networks on ultra-low-power tiny embedded devices has inspired research on compression techniques such as pruning, mixed-precision quantization, and approximation-aware training, which reduce memory requirements and computational complexity during inference. However, most tiny processors and microcontrollers currently used for inference do not support the vector or sub-byte integer arithmetic operations used in quantized convolutional neural network (CNN) models. Hence, they must execute additional instructions for packing and unpacking coefficients. Focusing on the multiply-accumulate (MAC) operations that dominate CNN runtime, we present a SIMD (single instruction, multiple data) accelerator tightly coupled to a RISC-V processor pipeline. The accelerator receives packed coefficients in 8-bit and 4-bit formats and outputs their dot product. Moreover, to reduce hardware cost and lower the latency of the SIMD unit, we propose an approximate multiplier structure that shares resources between 8×8-bit and 4×4-bit multiplications. Additionally, the level of approximation can be configured at synthesis time to trade off hardware resources against accuracy.
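
To make the packed dot product described in the abstract concrete, the following is a minimal software reference model, not the authors' hardware design or instruction encoding. It assumes a 32-bit register width, signed two's-complement lanes, and lowest-lane-first ordering, none of which are stated in the abstract; the function names `pack_lanes`, `unpack_lanes`, and `simd_dot` are hypothetical. It computes exact products only and does not model the approximate multiplier.

```python
# Reference model of a packed dot product over 8-bit or 4-bit lanes.
# Assumptions (not from the abstract): 32-bit words, signed lanes,
# lane 0 stored in the least significant bits.

WORD_BITS = 32


def pack_lanes(values, lane_bits):
    """Pack signed integers into one 32-bit word, lowest lane first."""
    assert lane_bits in (4, 8)
    assert len(values) == WORD_BITS // lane_bits
    mask = (1 << lane_bits) - 1
    word = 0
    for i, v in enumerate(values):
        # Masking keeps the two's-complement bit pattern of negative values.
        word |= (v & mask) << (i * lane_bits)
    return word


def unpack_lanes(word, lane_bits):
    """Split a 32-bit word into signed lanes, lowest lane first."""
    assert lane_bits in (4, 8)
    mask = (1 << lane_bits) - 1
    sign_bit = 1 << (lane_bits - 1)
    lanes = []
    for i in range(WORD_BITS // lane_bits):
        raw = (word >> (i * lane_bits)) & mask
        # Sign-extend: a set top bit means the lane holds a negative value.
        lanes.append(raw - (1 << lane_bits) if raw & sign_bit else raw)
    return lanes


def simd_dot(a_word, b_word, lane_bits, acc=0):
    """Return acc plus the dot product of the lanes of two packed words."""
    a = unpack_lanes(a_word, lane_bits)
    b = unpack_lanes(b_word, lane_bits)
    return acc + sum(x * y for x, y in zip(a, b))


if __name__ == "__main__":
    a8 = pack_lanes([1, -2, 3, -4], 8)
    b8 = pack_lanes([5, 6, -7, 8], 8)
    print(simd_dot(a8, b8, 8))   # four 8-bit lanes per word

    a4 = pack_lanes([1, -1, 2, -2, 3, -3, 7, -8], 4)
    b4 = pack_lanes([1] * 8, 4)
    print(simd_dot(a4, b4, 4))   # eight 4-bit lanes per word
```

On a processor without sub-byte or vector support, the shifting and masking in `unpack_lanes` corresponds to the extra packing and unpacking instructions the abstract refers to; a SIMD MAC unit of the kind described would take the two packed words directly.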

How to cite

APA:

Hernandez Morales, J.J., Hannig, F., & Teich, J. (2025). A SIMD MAC RISC-V Extension with Approximate Multipliers for Accelerating CNN Inference in Tiny Embedded Devices. In Architecture of Computing Systems, 38th International Conference, ARCS 2025, Kiel, Germany, April 22–24, 2025, Proceedings. Kiel, DE: Springer Cham.

MLA:

Hernandez Morales, Jose Juan, Frank Hannig, and Jürgen Teich. "A SIMD MAC RISC-V Extension with Approximate Multipliers for Accelerating CNN Inference in Tiny Embedded Devices." Proceedings of the 38th GI/ITG International Conference on Architecture of Computing Systems, Kiel, Springer Cham, 2025.