Comparing the Performance of Different x86 SIMD Instruction Sets for a Medical Imaging Application on Modern Multi- and Manycore Chips

Conference contribution


Publication details

Authors: Hofmann J, Eitzinger J, Hager G, Wellein G
Title of edited volume: WPMVP 2014 - Proceedings of the 2014 ACM SIGPLAN Workshop on Programming Models for SIMD/Vector Processing, Co-located with PPoPP 2014
Publisher: ACM
Place of publication: New York, NY, USA
Year of publication: 2014
Conference proceedings: Proceedings of the 2014 Workshop on Programming models for SIMD/Vector processing
Page range: 57-64
ISBN: 978-1-4503-2653-7


Abstract


Single Instruction, Multiple Data (SIMD) vectorization is a major driver of performance in current architectures, and is mandatory for achieving good performance with codes that are limited by instruction throughput. We investigate the efficiency of different SIMD-vectorized implementations of the RabbitCT benchmark. RabbitCT performs 3D image reconstruction by back projection, a vital operation in computed tomography applications. The underlying algorithm is a challenge for vectorization because, in addition to a streaming part, it includes a bilinear interpolation that requires scattered access to image data. We analyze the performance of SSE (128 bit), AVX (256 bit), AVX2 (256 bit), and IMCI (512 bit) implementations on recent Intel x86 systems. Special emphasis is put on the vector gather implementation on the Intel Haswell and Knights Corner microarchitectures. Finally, we discuss why GPU implementations perform much better for this specific algorithm. Copyright © 2014 ACM.
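
To illustrate why the bilinear interpolation mentioned in the abstract implies scattered memory access, the following is a minimal scalar sketch in Python. It is not taken from the paper or from the RabbitCT code; the function name `bilinear_sample`, the row-major image layout, and the zero-valued treatment of out-of-bounds neighbours are assumptions made for illustration only.

```python
import math


def bilinear_sample(image, width, height, x, y):
    """Interpolate a row-major 2D image at the fractional position (x, y).

    Hypothetical helper for illustration; neighbours outside the image
    are assumed to contribute zero.
    """
    x0 = math.floor(x)
    y0 = math.floor(y)
    fx = x - x0  # horizontal interpolation weight
    fy = y - y0  # vertical interpolation weight

    def pixel(ix, iy):
        # Bounds-checked read of one pixel from the flat image array.
        if 0 <= ix < width and 0 <= iy < height:
            return image[iy * width + ix]
        return 0.0

    # Four reads at data-dependent addresses: two lie in one image row and
    # two in the next, so they are generally not contiguous in memory.
    p00 = pixel(x0, y0)
    p10 = pixel(x0 + 1, y0)
    p01 = pixel(x0, y0 + 1)
    p11 = pixel(x0 + 1, y0 + 1)

    top = p00 * (1.0 - fx) + p10 * fx
    bottom = p01 * (1.0 - fx) + p11 * fx
    return top * (1.0 - fy) + bottom * fy
```

In a vectorized version, each SIMD lane would handle a different sample position, so the four reads per lane land at unrelated addresses. This is the access pattern that gather instructions are intended to serve.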



FAU authors / FAU editors

Eitzinger, Jan Dr.
Regionales Rechenzentrum Erlangen (RRZE)
Hager, Georg Dr.
Regionales Rechenzentrum Erlangen (RRZE)
Hofmann, Johannes
Lehrstuhl für Informatik 3 (Rechnerarchitektur)
Wellein, Gerhard Prof. Dr.
Professur für Höchstleistungsrechnen


How to cite

APA:
Hofmann, J., Eitzinger, J., Hager, G., & Wellein, G. (2014). Comparing the Performance of Different x86 SIMD Instruction Sets for a Medical Imaging Application on Modern Multi- and Manycore Chips. In Proceedings of the 2014 Workshop on Programming models for SIMD/Vector processing (pp. 57-64). Orlando, USA. New York, NY, USA: ACM.

MLA:
Hofmann, Johannes, et al. "Comparing the Performance of Different x86 SIMD Instruction Sets for a Medical Imaging Application on Modern Multi- and Manycore Chips." Proceedings of the 2014 1st ACM SIGPLAN Workshop on Programming Models for SIMD/Vector Processing, WPMVP 2014 - Co-located with PPoPP 2014, Orlando, USA. New York, NY, USA: ACM, 2014. 57-64.

Last updated 2019-07-23 at 07:53