AI Application Benchmarking: Power-Aware Performance Analysis for Vision and Language Models

Mayr M, Wind S, Schröder L, Hager G, Köstler H, Wellein G (2026)

Publication Status: In review

Publication Type: Unpublished / Preprint

Future Publication Type: Journal article

Publication year: 2026

Original Authors: Martin Mayr, Sebastian Wind, Lukas Schröder, Georg Hager, Harald Köstler, Gerhard Wellein

Abstract

Artificial Intelligence (AI) workloads drive a rapid expansion of high-performance computing (HPC) infrastructures and push their power and energy demands towards a critical level. AI benchmarks that represent state-of-the-art workloads, together with an understanding of their performance-energy trade-offs, are critical for deploying efficient infrastructures and can guide energy-efficiency measures such as power limiting. We introduce a benchmarking framework with popular deep learning applications from computer vision (image classification and generation) and large language models (continued pre-training and inference) implementing modern methods. Our performance analysis focuses on throughput rather than on "time to completion", the standard metric in HPC. We analyse performance and energy efficiency under various power-limit settings on NVIDIA H100, NVIDIA H200, and AMD MI300X GPUs. Our results reveal that no universal optimal power limit exists, as the efficiency peak varies across application types and GPU architectures. Interestingly, the two NVIDIA GPUs, which differ mainly in their high-bandwidth memory (HBM) configuration, show qualitatively different performance-energy trade-offs. Code is available on Zenodo and GitHub.

Authors with CRIS profile

Martin Mayr, Erlangen National High Performance Computing Center (NHR@FAU)

Sebastian Wind, Lehrstuhl für Informatik 14 (Bild- und Sprachverarbeitung) (LME)

Lukas Schröder, Erlangen National High Performance Computing Center (NHR@FAU)

Georg Hager, Regionales Rechenzentrum Erlangen (RRZE)

Harald Köstler, Lehrstuhl für Informatik 10 (Systemsimulation) (LSS)

Gerhard Wellein, Professur für Höchstleistungsrechnen

How to cite

APA:

Mayr, M., Wind, S., Schröder, L., Hager, G., Köstler, H., & Wellein, G. (2026). AI Application Benchmarking: Power-Aware Performance Analysis for Vision and Language Models. (Unpublished, In review).

MLA:

Mayr, Martin, et al. AI Application Benchmarking: Power-Aware Performance Analysis for Vision and Language Models. Unpublished, In review. 2026.
