Laukemann J, Hammer J, Hofmann J, Hager G, Wellein G (2019)
Publication Type: Conference contribution
Publication year: 2019
Publisher: Institute of Electrical and Electronics Engineers Inc.
Page Range: 121-131
Conference Proceedings Title: Proceedings of PMBS 2018: Performance Modeling, Benchmarking and Simulation of High Performance Computer Systems, Held in conjunction with SC 2018: The International Conference for High Performance Computing, Networking, Storage and Analysis
ISBN: 9781728101828
DOI: 10.1109/PMBS.2018.8641578
An accurate prediction of scheduling and execution of instruction streams is a necessary prerequisite for predicting the in-core performance behavior of throughput-bound loop kernels on out-of-order processor architectures. Such predictions are an indispensable component of analytical performance models, such as the Roofline and the Execution-Cache-Memory (ECM) model, and allow a deep understanding of the performance-relevant interactions between hardware architecture and loop code. We present the Open Source Architecture Code Analyzer (OSACA), a static analysis tool for predicting the execution time of sequential loops comprising x86 instructions under the assumption of an infinite first-level cache and perfect out-of-order scheduling. We show the process of building a machine model from available documentation and semi-automatic benchmarking, and carry it out for the latest Intel Skylake and AMD Zen micro-architectures. To validate the constructed models, we apply them to several assembly kernels and compare runtime predictions with actual measurements. Finally, we give an outlook on how the method may be generalized to new architectures.
APA:
Laukemann, J., Hammer, J., Hofmann, J., Hager, G., & Wellein, G. (2019). Automated instruction stream throughput prediction for Intel and AMD microarchitectures. In Proceedings of PMBS 2018: Performance Modeling, Benchmarking and Simulation of High Performance Computer Systems, Held in conjunction with SC 2018: The International Conference for High Performance Computing, Networking, Storage and Analysis (pp. 121-131). Dallas, TX, US: Institute of Electrical and Electronics Engineers Inc.
MLA:
Laukemann, Jan, et al. "Automated instruction stream throughput prediction for Intel and AMD microarchitectures." Proceedings of the 2018 IEEE/ACM Performance Modeling, Benchmarking and Simulation of High Performance Computer Systems, PMBS 2018, Dallas, TX: Institute of Electrical and Electronics Engineers Inc., 2019. 121-131.