Professorship for High Performance Computing


Description:

The professorship's research activities are located at the interface between numerical applications and modern parallel high-performance computers. The central area of work is the efficient implementation, optimization, and parallelization of numerical methods and application programs on heterogeneous, (highly) parallel computers. To this end, innovative optimization and parallelization approaches are developed that are tailored to the specific characteristics of novel computer architectures. The research follows a structured, performance-model-based approach (Performance Engineering). In addition, simple tools are developed that support the performance engineering process. Application-oriented focus areas of the professorship are stencil-based applications as well as basic operations and eigenvalue solvers for large sparse systems. The professorship is linked to the leadership of the HPC group at the Regionales Rechenzentrum Erlangen (Erlangen Regional Computing Center).

Address:
Martensstraße 3
91058 Erlangen


Research Fields

Performance Engineering
Tools for performance modeling and performance analysis
Hardware-efficient building blocks for sparse linear algebra and stencil-based methods


Related Project(s)


(Energy Oriented Center of Excellence: toward exascale for energy):
EoCoE-II: Energy Oriented Center of Excellence: toward exascale for energy (Performance evaluation, modelling and optimization)
Prof. Dr. Gerhard Wellein
(01/01/2019 - 31/12/2021)


SeASiTe: Selbstadaption für zeitschrittbasierte Simulationstechniken auf heterogenen HPC-Systemen (Self-adaptation for time-step-based simulation techniques on heterogeneous HPC systems)
Prof. Dr. Gerhard Wellein
(01/03/2017 - 29/02/2020)


ProPE: Process-Oriented Performance Engineering Service Infrastructure for Scientific Software at German HPC Centers
Prof. Dr. Gerhard Wellein
(01/01/2017 - 31/12/2019)


MeTacca: Metaprogrammierung für Beschleunigerarchitekturen (Metaprogramming for accelerator architectures)
Prof. Dr. Gerhard Wellein; Prof. Dr. Harald Köstler
(01/01/2017 - 31/12/2019)


(SPP 1648: Software for Exascale Computing):
SPPEXA: Equipping Sparse Solvers for Exascale II (ESSEX-II)
Prof. Dr. Gerhard Wellein
(01/01/2016 - 31/12/2018)



Publications


Anzt, H., Kreutzer, M., Ponce, E., Peterson, G.D., Wellein, G., & Dongarra, J. (2016). Optimization and performance evaluation of the IDR iterative Krylov solver on GPUs. International Journal of High Performance Computing Applications. https://dx.doi.org/10.1177/1094342016646844
Hofmann, J., Fey, D., Eitzinger, J., Hager, G., & Wellein, G. (2016). Analysis of Intel's Haswell Microarchitecture Using the ECM Model and Microbenchmarks. In Architecture of Computing Systems -- ARCS 2016: 29th International Conference, Nuremberg, Germany, April 4-7, 2016, Proceedings (pp. 210-222). Nuremberg: Cham: Springer International Publishing.
Bauer, S., Bunge, H.-P., Drzisga, D.P., Gmeiner, B., Huber, M., John, L.,... Wohlmuth, B.I. (2016). Hybrid Parallel Multigrid Methods for Geodynamical Simulations. In Bungartz H., Neumann P., Nagel E. (Eds.), 113. (pp. 211-235). Berlin, Heidelberg, New York: Springer.
Feichtinger, C., Habich, J., Köstler, H., Rüde, U., & Aoki, T. (2015). Performance Modeling and Analysis of Heterogeneous Lattice Boltzmann Simulations on CPU-GPU Clusters. Parallel Computing, 46, 1-13. https://dx.doi.org/10.1016/j.parco.2014.12.003
Gmeiner, B., Rüde, U., Stengel, H., Waluga, C., & Wohlmuth, B.I. (2015). Towards Textbook Efficiency for Parallel Multigrid. Numerical Mathematics-Theory Methods and Applications, 8(1), 22-46. https://dx.doi.org/10.4208/nmtma.2015.w10si
Wittmann, M., Hager, G., Zeiser, T., Treibig, J., & Wellein, G. (2015). Chip-level and multi-node analysis of energy-optimized lattice Boltzmann CFD simulations. Concurrency and Computation-Practice & Experience, 1-5. https://dx.doi.org/10.1002/cpe.3489
Shahzad, F., Kreutzer, M., Zeiser, T., Machado, R., Pieper, A., Hager, G., & Wellein, G. (2015). Building a Fault Tolerant Application Using the GASPI Communication Layer. In Proceedings of FTS 2015 (pp. 580-587). Chicago, IL: in conjunction with IEEE Cluster 2015: IEEE.
Hammer, J., Hager, G., Eitzinger, J., & Wellein, G. (2015). Automatic Loop Kernel Analysis and Performance Modeling With Kerncraft. In Proceedings of the 6th International Workshop on Performance Modeling, Benchmarking, and Simulation of High Performance Computing Systems (pp. 1-11). Austin, TX, USA: New York, NY, USA: ACM.
Wellein, G., Eitzinger, J., Hager, G., & Röhl, T. (2015). Overhead Analysis of Performance Counter Measurements. In Proceedings of the 43rd International Conference on Parallel Processing Workshops, ICPPW 2014 (pp. 176-185). Institute of Electrical and Electronics Engineers Inc.
Wellein, G., Hager, G., Stengel, H., Keyes, D., Malas, T., & Ltaief, H. (2015). Multicore-optimized wavefront diamond blocking for optimizing stencil updates. SIAM Journal on Scientific Computing, 37(4), C439-C464. https://dx.doi.org/10.1137/140991133
Hammer, J., Hager, G., Eitzinger, J., & Wellein, G. (2015). Automatic loop kernel analysis and performance modeling with kerncraft. In Proceedings of the 6th International Workshop in Performance Modeling, Benchmarking and Simulation of High Performance Computer Systems, PMBS 2015 - Held as part of the 27th ACM/IEEE International Conference for High Performance Computing, Networking, Storage and Analysis, SC 2015. Association for Computing Machinery, Inc.
Gmeiner, B., Rüde, U., Stengel, H., Waluga, C., & Wohlmuth, B.I. (2015). Performance and Scalability of Hierarchical Hybrid Multigrid Solvers for Stokes Systems. SIAM Journal on Scientific Computing, 37(2), C143-C168. https://dx.doi.org/10.1137/130941353
Hofmann, J., Fey, D., Eitzinger, J., Hager, G., & Wellein, G. (2015). Performance analysis of the Kahan-enhanced scalar product on current multicore processors. In Accepted for PPAM 2015 (pp. 1-10). Krakow, Poland, PL.
Kreutzer, M., Hager, G., Wellein, G., Alvermann, A., Fehske, H., & Pieper, A. (2015). Performance Engineering of the Kernel Polynomial Method on Large-Scale CPU-GPU Systems. In IEEE (Eds.), Proceedings of the 2015 IEEE International Parallel and Distributed Processing Symposium (IPDPS) (pp. 417-426). Hyderabad, India, IN.
Röhrig-Zöllner, M., Thies, J., Kreutzer, M., Alvermann, A., Pieper, A., Basermann, A.,... Fehske, H. (2015). Increasing the performance of the Jacobi-Davidson method by blocking. SIAM Journal on Scientific Computing, DLR Portal ISSN 1064-8275, 1-27. https://dx.doi.org/10.1137/140976017
Klawonn, A., Lanser, M., Rheinbach, O., Stengel, H., & Wellein, G. (2015). Hybrid MPI/OpenMP Parallelization in FETI-DP Methods. In Recent Trends in Computational Engineering - CE2014. (pp. 67-84). -: Springer Link.
Kreutzer, M., Hager, G., Wellein, G., Fehske, H., & Bishop, A.R. (2014). A unified sparse matrix data format for efficient general sparse matrix-vector multiplication on modern processors with wide SIMD units. SIAM Journal on Scientific Computing, 36(5), C401–C423. https://dx.doi.org/10.1137/130930352
Wittmann, M., Zeiser, T., Hager, G., & Wellein, G. (2014). Modeling and analyzing performance for highly optimized propagation steps of the lattice Boltzmann method on sparse lattices.
Hofmann, J., Eitzinger, J., Hager, G., & Wellein, G. (2014). Comparing the Performance of Different x86 SIMD Instruction Sets for a Medical Imaging Application on Modern Multi- and Manycore Chips. In Proceedings of the 2014 Workshop on Programming models for SIMD/Vector processing (pp. 57-64). Orlando, USA: New York, NY, USA: ACM.
Hofmann, J., Eitzinger, J., Hager, G., & Wellein, G. (2014). Performance Engineering for a Medical Imaging Application on the Intel Xeon Phi Accelerator. In ARCS Workshops'14 (pp. 1-8). Lübeck, Germany, DE.


Additional Publications


Shahzad, F., Thies, J., Kreutzer, M., Zeiser, T., Hager, G., & Wellein, G. (2018). CRAFT: A library for easier application-level Checkpoint/Restart and Automatic Fault Tolerance. IEEE Transactions on Parallel and Distributed Systems. https://dx.doi.org/10.1109/TPDS.2018.2866794

Last updated on 2019-04-24 at 10:15