A Bregman Learning Framework for Sparse Neural Networks

Bungert L, Roith T, Tenbrinck D, Burger M (2022)


Publication Language: English

Publication Type: Journal article, Original article

Publication year: 2022

Journal: Journal of Machine Learning Research

Open Access Link: https://www.jmlr.org/papers/v23/21-0545.html

Abstract

We propose a learning framework based on stochastic Bregman iterations, also known as mirror descent, to train sparse neural networks with an inverse scale space approach. We derive a baseline algorithm called LinBreg, an accelerated version using momentum, and AdaBreg, which is a Bregmanized generalization of the Adam algorithm. In contrast to established methods for sparse training, the proposed family of algorithms constitutes a regrowth strategy for neural networks that is solely optimization-based without additional heuristics. Our Bregman learning framework starts the training with very few initial parameters, successively adding only significant ones to obtain a sparse and expressive network. The proposed approach is extremely easy and efficient, yet supported by the rich mathematical theory of inverse scale space methods. We derive a statistically profound sparse parameter initialization strategy and provide a rigorous stochastic convergence analysis of the loss decay and additional convergence proofs in the convex regime. Using only 3.4% of the parameters of ResNet-18 we achieve 90.2% test accuracy on CIFAR-10, compared to 93.6% using the dense network. Our algorithm also unveils an autoencoder architecture for a denoising task. The proposed framework also has a huge potential for integrating sparse backpropagation and resource-friendly training. Code is available at https://github.com/TimRoith/BregmanLearning.
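For orientation, the LinBreg baseline named in the abstract is a stochastic linearized Bregman iteration. Below is a minimal NumPy sketch of one such step, assuming the elastic-net regularizer J(theta) = lam * ||theta||_1 + ||theta||^2 / (2 * delta) commonly used in inverse scale space methods; all names here (linbreg_step, soft_shrink, tau, lam, delta) are illustrative assumptions, not the authors' API. The actual PyTorch implementation is in the linked repository.

import numpy as np

def soft_shrink(v, lam):
    # Soft-shrinkage: the proximal operator of lam * ||.||_1.
    return np.sign(v) * np.maximum(np.abs(v) - lam, 0.0)

def linbreg_step(v, grad, tau=0.1, lam=1.0, delta=1.0):
    # One stochastic linearized Bregman step (illustrative sketch):
    # the subgradient variable v accumulates gradient information,
    # while the parameters theta = grad J*(v) remain exactly zero
    # until |v| exceeds the threshold lam (inverse scale space effect).
    v = v - tau * grad
    theta = delta * soft_shrink(v, lam)
    return v, theta

# Toy usage: minimize the quadratic loss 0.5 * ||theta - target||^2.
target = np.array([2.0, 0.05, -1.5, 0.0])
v = np.zeros_like(target)      # sparse initialization: all parameters start at zero
theta = np.zeros_like(target)
for _ in range(200):
    grad = theta - target      # gradient of the toy loss at the current theta
    v, theta = linbreg_step(v, grad, tau=0.1, lam=1.0)
# Large coordinates of target enter the support of theta first,
# mirroring the "start sparse, add only significant parameters" behavior.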

How to cite

APA:

Bungert, L., Roith, T., Tenbrinck, D., & Burger, M. (2022). A Bregman Learning Framework for Sparse Neural Networks. Journal of Machine Learning Research.

MLA:

Bungert, Leon, et al. "A Bregman Learning Framework for Sparse Neural Networks." Journal of Machine Learning Research (2022).

BibTeX:
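Since the download link is not reproduced here, the following entry is reconstructed from the metadata on this page; the volume is inferred from the JMLR URL, and issue and page numbers are omitted because they are not listed here.

@article{bungert2022bregman,
  author  = {Bungert, Leon and Roith, Tim and Tenbrinck, Daniel and Burger, Martin},
  title   = {A Bregman Learning Framework for Sparse Neural Networks},
  journal = {Journal of Machine Learning Research},
  volume  = {23},
  year    = {2022},
  url     = {https://www.jmlr.org/papers/v23/21-0545.html}
}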