Mishra A, Hannig F, Teich J, Sabih M (2022)
Publication Language: English
Publication Type: Conference contribution
Publication year: 2022
Publisher: Institute of Electrical and Electronics Engineers (IEEE)
City/Town: Pittsburgh, PA, USA
Pages Range: 1-8
Conference Proceedings Title: 2022 IEEE 13th International Green and Sustainable Computing Conference (IGSC)
Event location: Virtual
ISBN: 978-1-6654-6551-9
URI: https://ieeexplore.ieee.org/document/9969374
DOI: 10.1109/IGSC55832.2022.9969363
Open Access Link: https://www.computer.org/csdl/proceedings-article/igsc/2022/09969
Deep neural networks (DNNs) are computationally intensive, making them difficult to deploy on resource-constrained embedded systems. Model compression is a set of techniques that removes redundancies from a neural network with affordable degradation in task performance. Most compression methods do not target hardware-oriented objectives such as latency directly; instead, a few methods approximate latency using floating-point operations (FLOPs) or multiply-accumulate operations (MACs). These indirect metrics do not translate directly into the performance metrics that matter on the hardware, i.e., latency and throughput. To address this limitation, we introduce Multi-Objective Sensitivity Pruning, “MOSP,” a three-stage pipeline for filter pruning: hardware-aware sensitivity analysis, Criteria-optimal configuration selection, and pruning based on explainable AI (XAI). Our pipeline supports a single target objective or a combination of objectives such as latency, energy consumption, and accuracy. Our method first formulates the sensitivity of a model's layers to the target objectives as a classical machine learning problem. Next, we choose a Criteria-optimal configuration controlled by hyperparameters specific to each chosen objective. Finally, we apply XAI-based filter ranking to select the filters to be pruned. The pipeline follows an iterative pruning methodology to recover any degradation in task performance (e.g., accuracy), and it allows the user to prefer one objective over another. Our method outperforms the selected baseline method across different neural networks and datasets in both accuracy and latency reduction and is competitive with state-of-the-art approaches.
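To make the three-stage pipeline described in the abstract concrete, the following is a minimal Python sketch of its control flow. All names here (estimate_latency, xai_filter_scores, the toy layer sensitivities, and the objective weights) are hypothetical stand-ins, not the paper's implementation; in particular, the paper formulates layer sensitivity as a classical machine learning problem, whereas this sketch simply measures a mock objective per layer and ratio for brevity.

import numpy as np

rng = np.random.default_rng(0)
LAYERS = ["conv1", "conv2", "conv3"]   # toy model: three conv layers
RATIOS = [0.1, 0.3, 0.5, 0.7]          # candidate per-layer pruning ratios

def estimate_latency(layer, ratio):
    """Stage 1 (hardware-aware): stand-in for measuring on-device latency
    of the model with `layer` pruned by `ratio` (hypothetical values)."""
    base = {"conv1": 4.0, "conv2": 8.0, "conv3": 6.0}[layer]
    return base * (1.0 - 0.8 * ratio) + rng.normal(0, 0.05)

def estimate_accuracy_drop(layer, ratio):
    """Stand-in for validating the pruned model; deeper layers are
    assumed more sensitive in this toy setup."""
    sens = {"conv1": 0.02, "conv2": 0.05, "conv3": 0.08}[layer]
    return sens * ratio ** 2 * 100

# Stage 1: tabulate each layer's sensitivity against each target objective.
sensitivity = {
    (l, r): {"latency": estimate_latency(l, r),
             "acc_drop": estimate_accuracy_drop(l, r)}
    for l in LAYERS for r in RATIOS
}

# Stage 2: choose a "Criteria-optimal" per-layer ratio via a weighted score;
# the weights play the role of the user's preference between objectives.
W_LATENCY, W_ACC = 0.6, 0.4

def score(l, r):
    s = sensitivity[(l, r)]
    return W_LATENCY * s["latency"] + W_ACC * s["acc_drop"]

config = {l: min(RATIOS, key=lambda r: score(l, r)) for l in LAYERS}

# Stage 3: XAI-based filter ranking -- mocked here with random relevance
# scores; the paper ranks filters using explainable-AI attributions.
def xai_filter_scores(layer, n_filters=16):
    return rng.random(n_filters)

pruned = {}
for layer, ratio in config.items():
    scores = xai_filter_scores(layer)
    k = int(len(scores) * ratio)
    pruned[layer] = np.argsort(scores)[:k]  # drop the k least-relevant filters

print("chosen ratios:", config)
print("filters pruned per layer:", {l: len(ix) for l, ix in pruned.items()})

In the actual method this loop would run iteratively, fine-tuning after each pruning step to recover accuracy, as the abstract describes.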
APA:
Mishra, A., Hannig, F., Teich, J., & Sabih, M. (2022). MOSP: Multi-Objective Sensitivity Pruning of Deep Neural Networks. In IEEE (Eds.), 2022 IEEE 13th International Green and Sustainable Computing Conference (IGSC) (pp. 1-8). Virtual: Pittsburgh, PA, USA: Institute of Electrical and Electronics Engineers (IEEE).
MLA:
Mishra, Ashutosh, et al. "MOSP: Multi-Objective Sensitivity Pruning of Deep Neural Networks." 2022 IEEE 13th International Green and Sustainable Computing Conference (IGSC), Institute of Electrical and Electronics Engineers (IEEE), 2022, pp. 1-8.