Hager G, Zeiser T, Eitzinger J, Wellein G (2006)
Publication Language: English
Publication Type: Conference contribution, Original article
Publication year: 2006
Publisher: Springer-Verlag
Edited Volumes: Notes on Numerical Fluid Mechanics and Multidisciplinary Design
Series: Notes on Numerical Fluid Mechanics and Multidisciplinary Design (NNFM)
City/Town: Berlin Heidelberg
Book Volume: 91
Pages Range: 273-287
Conference Proceedings Title: Computational Science and High Performance Computing II
Event location: Stuttgart, Germany
ISBN: 978-3-540-31767-8
URI: http://www.springerlink.com/content/8401n54088177483/
We discuss basic optimization and parallelization strategies for current cache-based microprocessors (Intel Itanium2, Intel Netburst and AMD64 variants) in single-CPU and shared-memory environments. Using selected kernel benchmarks that represent data-intensive applications, we focus on the attainable effective bandwidths, which are still suboptimal with current compilers. We stress the need for a subtle OpenMP implementation, even for simple benchmark programs, to exploit the high aggregate memory bandwidth available nowadays on ccNUMA systems. If the quality of main memory access is the measure, classical vector systems such as the NEC SX6+ are still in a class of their own and are able to sustain the performance level of in-cache operations of modern microprocessors even with arbitrarily large data sets. © 2006 Springer-Verlag Berlin Heidelberg.
APA:
Hager, G., Zeiser, T., Eitzinger, J., & Wellein, G. (2006). Optimizing performance on modern HPC systems: learning from simple kernel benchmarks. In Computational Science and High Performance Computing II (pp. 273-287). Berlin Heidelberg: Springer-Verlag.
MLA:
Hager, Georg, et al. "Optimizing performance on modern HPC systems: learning from simple kernel benchmarks." Proceedings of the 2nd Russian-German Advanced Research Workshop, Stuttgart, Germany. Berlin Heidelberg: Springer-Verlag, 2006. 273-287.