Third party funded individual grant
Acronym: NA³Os
Start date: 01.03.2024
End date: 28.02.2027
Embedded Machine Learning (ML) constitutes a rapidly growing field that
comprises ML algorithms, hardware, and software capable of performing
on-device sensor data analyses at extremely low power, thus enabling
numerous always-on and battery-powered applications and services.
Running ML-based applications on embedded edge devices has attracted
phenomenal research and business interest for many reasons, including
accessibility, privacy, latency, cost, and security. Embedded ML is
primarily represented by artificial intelligence (AI) at the edge
(EdgeAI) and on tiny, ultra-resource-constrained devices, a.k.a. TinyML.
TinyML demands energy efficiency and low latency while retaining
acceptable accuracy, thus mandating optimization of the entire software
and hardware stack.
GPUs form the default platform for deep neural network (DNN) training
workloads, owing to the massive parallelism of their thousands of
processing cores. However, GPUs are often not an optimal solution for
DNN inference acceleration due to their high energy cost and lack of
reconfigurability, especially for highly sparse models or customized
architectures. Field Programmable Gate Arrays (FPGAs), on the other
hand, offer potentially lower latency and higher energy efficiency than
GPUs, along with high customization and faster time-to-market, combined
with a potentially longer useful life than ASIC solutions.
In the context of TinyML, NA³Os focuses on a neural approximate
accelerator-architecture co-search targeting specifically lightweight
FPGA devices. This project investigates design techniques to optimally
and automatically map DNNs to resource-constrained FPGAs while
exploiting principles of approximate computing (a toy sketch
illustrating this principle is given below). Our particular topics of
investigation include:
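To make the notion of approximate computing concrete, the following
minimal Python sketch (illustrative only, not project code; the helper
approx_mul, the truncation width k, and the random int8 data are all
assumptions) emulates a truncation-based "approximate multiplier", a
well-known approximate-computing technique for cheaper FPGA arithmetic,
inside a quantized dot product and reports the resulting error.

    # Toy illustration (not project code): an "approximate multiplier" that
    # zeroes the k least-significant bits of each operand, a classic
    # approximate-computing trick for cheaper FPGA arithmetic, used here
    # inside a quantized (int8) dot product. All names are illustrative.
    import numpy as np

    def approx_mul(a: int, b: int, k: int = 2) -> int:
        """Multiply after truncating (zeroing) the k low bits of each operand."""
        mask = ~((1 << k) - 1)
        return (a & mask) * (b & mask)

    def dot(x, w, mul):
        """Dot product using the supplied scalar multiplier."""
        return sum(mul(int(a), int(b)) for a, b in zip(x, w))

    rng = np.random.default_rng(0)
    x = rng.integers(-128, 128, size=64, dtype=np.int8)  # quantized activations
    w = rng.integers(-128, 128, size=64, dtype=np.int8)  # quantized weights

    exact = dot(x, w, lambda a, b: a * b)
    approx = dot(x, w, approx_mul)
    print(f"exact={exact}  approx={approx}  abs.err={abs(approx - exact)}")

In hardware, zeroing the low-order operand bits removes partial-product
rows from the multiplier array, trading a small, bounded numerical error
for area and energy savings; accuracy/efficiency trade-offs of this kind
are exactly what an accelerator-architecture co-search navigates.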