Fröhlich J, Rosenberger H, Müller R (2025)
Publication Type: Conference contribution
Publication year: 2025
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages Range: 352-357
Conference Proceedings Title: Proceedings - 2025 IEEE Conference on Artificial Intelligence, CAI 2025
Event location: Santa Clara, CA, USA
ISBN: 9798331524005
DOI: 10.1109/CAI64502.2025.00064
The main contributor to the computational cost of the inference phase of neural networks is the constant matrix-vector multiplication. Linear computation coding has emerged as a novel approach for reducing the complexity of constant matrix-vector multiplications. In this paper, we demonstrate that linear computation coding methods are compatible with pruning of large language models and reduce the computational cost of the inference phase beyond what pruning alone can achieve. We use perplexity as the metric to identify columns suitable for pruning. Layers with lower importance are pruned in a systematic column-wise manner. This structured pruning approach directly benefits the linear computation coding algorithms, yielding an additional gain. Applying fine-tuning to the pruned models leads to up to a 4.5-fold decrease in computational cost while maintaining over 85% of the zero-shot accuracy on several natural language tasks. We also discuss the trade-off between the computational cost and the performance of large language models.
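To illustrate why column-wise (structured) pruning lowers the cost of a constant matrix-vector product, here is a minimal NumPy sketch. It is not the method from the paper: the function name `prune_columns`, the `keep_ratio` parameter, and the use of column L2 norms as an importance score are assumptions made for illustration only (the abstract states that perplexity is used to select columns), and the linear computation coding step is not shown.

```python
import numpy as np

def prune_columns(W, keep_ratio):
    """Keep the columns of W with the largest L2 norm.

    The L2 norm is a hypothetical stand-in for an importance score;
    it is not the perplexity-based criterion described in the abstract.
    """
    n_keep = max(1, int(round(keep_ratio * W.shape[1])))
    scores = np.linalg.norm(W, axis=0)            # one score per column
    keep = np.sort(np.argsort(scores)[-n_keep:])  # indices of kept columns
    return W[:, keep], keep

rng = np.random.default_rng(0)
W = rng.standard_normal((8, 16))   # constant weight matrix
x = rng.standard_normal(16)        # input vector

W_pruned, keep = prune_columns(W, keep_ratio=0.5)

# Removing a column also removes the matching input entry,
# so the pruned product uses a genuinely smaller dense matrix.
y_pruned = W_pruned @ x[keep]

# Same result as zeroing the pruned columns in the full matrix.
W_masked = np.zeros_like(W)
W_masked[:, keep] = W[:, keep]
y_masked = W_masked @ x

dense_macs = W.size          # multiply-accumulates for the full product
pruned_macs = W_pruned.size  # multiply-accumulates after pruning
print(dense_macs, pruned_macs, np.allclose(y_pruned, y_masked))
```

Because whole columns are removed, the remaining matrix stays dense and smaller, which is the kind of structure a subsequent matrix decomposition can work on directly; unstructured pruning would instead leave scattered zeros in a matrix of the original size.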
APA:
Fröhlich, J., Rosenberger, H., & Müller, R. (2025). Spicing Up LLMs: The Role of PAPRICA Pruning in Linear Computation Coding. In Proceedings - 2025 IEEE Conference on Artificial Intelligence, CAI 2025 (pp. 352-357). Santa Clara, CA, USA: Institute of Electrical and Electronics Engineers Inc.
MLA:
Fröhlich, Johanna, Hans Rosenberger, and Ralf Müller. "Spicing Up LLMs: The Role of PAPRICA Pruning in Linear Computation Coding." Proceedings of the 3rd IEEE Conference on Artificial Intelligence, CAI 2025, Santa Clara, CA, USA: Institute of Electrical and Electronics Engineers Inc., 2025. 352-357.