Exploring performance and power properties of modern multi-core chips via simple machine models
Author(s): Wellein G, Habich J, Hager G, Eitzinger J
Publication year: 2016
Journal issue: 2
Page range: 189-210
Modern multi-core chips show complex behavior with respect to performance and power. Starting with the Intel Sandy Bridge processor, it has become possible to directly measure the power dissipation of a CPU chip and correlate this data with the performance properties of the running code. Going beyond a simple bottleneck analysis, we employ the recently published Execution-Cache-Memory (ECM) model to describe the single-core and multi-core performance of streaming kernels. The model refines the well-known roofline model in that it can predict the scaling and the saturation behavior of bandwidth-limited loop kernels on a multi-core chip. The saturation point is especially relevant for considerations of energy consumption. From power dissipation measurements of benchmark programs with vastly different hardware requirements, we derive a simple, phenomenological power model for the Sandy Bridge processor. Together with the ECM model, we are able to explain many peculiarities in the performance and power behavior of multi-core processors and derive guidelines for energy-efficient execution of parallel programs. Finally, we show that the ECM and power models can be successfully used to describe the scaling and power behavior of a lattice Boltzmann flow solver code. Copyright (c) 2013 John Wiley & Sons, Ltd.
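The saturation behavior mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes that the performance of a bandwidth-limited kernel grows linearly with the number of cores until it reaches a memory-bound ceiling, and that chip power consists of a baseline plus a per-core term that depends on the clock frequency. The function names, the functional form of the power term, and all numbers in the usage note are hypothetical and chosen only for illustration.

```python
import math


def scaled_performance(n_cores, p_single, p_saturated):
    """Assumed scaling: linear in the core count, capped at the
    bandwidth-bound ceiling p_saturated (same unit as p_single)."""
    return min(n_cores * p_single, p_saturated)


def saturation_cores(p_single, p_saturated):
    """Smallest core count at which the assumed linear scaling
    reaches the bandwidth-bound ceiling."""
    return math.ceil(p_saturated / p_single)


def chip_power(n_cores, f_ghz, w0, w1, w2):
    """Hypothetical power model: baseline w0 plus a per-core part
    that is linear and quadratic in the clock frequency f_ghz."""
    return w0 + (w1 * f_ghz + w2 * f_ghz ** 2) * n_cores


def energy_per_work(n_cores, f_ghz, p_single, p_saturated, w0, w1, w2):
    """Energy per unit of work = power / performance."""
    power = chip_power(n_cores, f_ghz, w0, w1, w2)
    return power / scaled_performance(n_cores, p_single, p_saturated)
```

With made-up values such as `p_single=1.0`, `p_saturated=3.5`, `f_ghz=2.0`, `w0=20.0`, `w1=1.0`, and `w2=1.5`, evaluating `energy_per_work` over a range of core counts shows how the energy per unit of work changes as the kernel approaches and passes the saturation point. Real parameters would have to come from measurements on the target chip.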
How to cite
APA: Wellein, G., Habich, J., Hager, G., & Eitzinger, J. (2016). Exploring performance and power properties of modern multi-core chips via simple machine models. Concurrency and Computation: Practice and Experience, 28(2), 189-210. https://dx.doi.org/10.1002/cpe.3180
MLA: Wellein, Gerhard, et al. "Exploring performance and power properties of modern multi-core chips via simple machine models." Concurrency and Computation: Practice and Experience 28.2 (2016): 189-210.