Professorship for High Performance Computing (Professur für Höchstleistungsrechnen)


Description:

The professorship's research sits at the interface between numerical applications and modern parallel high-performance computers. Its central field of work is the efficient implementation, optimization, and parallelization of numerical methods and application programs on heterogeneous, (highly) parallel computers. To this end, it develops innovative optimization and parallelization approaches tailored to the particular properties of novel computer architectures. The research follows a structured approach based on performance models (performance engineering). In addition, the professorship develops simple tools that support the performance engineering process. Its application-oriented focus areas are stencil-based applications as well as basic operations and eigenvalue solvers for large sparse systems. The professorship is tied to the leadership of the HPC group at the Regionales Rechenzentrum Erlangen (Erlangen Regional Computing Center).

Address:
Martensstraße 3
91058 Erlangen


Research areas

Hardware-efficient building blocks for sparse linear algebra and stencil-based methods
Performance Engineering
Tools for performance modeling and performance analysis


Research project(s)

(SPP 1648: Software for Exascale Computing):
SPPEXA: EXASTEEL II - Bridging Scales for Multiphase Steels
Prof. Dr. Gerhard Wellein
(01.01.2016 - 31.12.2018)


(SPP 1648: Software for Exascale Computing):
TERRA-NEO - Integrated Co-Design of an Exascale Earth Mantle Modeling Framework
Prof. Dr. Gerhard Wellein
(01.11.2012 - 31.12.2015)


(SPP 1648: Software for Exascale Computing):
SPPEXA: ESSEX - Equipping Sparse Solvers for Exascale
Prof. Dr. Gerhard Wellein
(01.11.2012 - 31.12.2015)


(SPP 1648: Software for Exascale Computing):
EXASTEEL - Bridging Scales for Multiphase Steels
Prof. Dr. Gerhard Wellein
(01.11.2012 - 31.12.2015)


(SPP 1648: Software for Exascale Computing):
ESSEX - Equipping Sparse Solvers for Exascale
Dr. Georg Hager; Prof. Dr. Gerhard Wellein
(01.11.2012 - 30.06.2019)



Publications

Habich, J., Zeiser, T., Hager, G., & Wellein, G. (2011). Performance analysis and optimization strategies for a D3Q19 lattice Boltzmann kernel on nVIDIA GPUs using CUDA. In Advances in Engineering Software (pp. 266-272). ScienceDirect: Elsevier.
Eitzinger, J., Hager, G., & Wellein, G. (2010). LIKWID: A Lightweight Performance-Oriented Tool Suite for x86 Multicore Environments. In Proceedings of PSTI2010, the First International Workshop on Parallel Software Tools and Tool Infrastructures (pp. 207-216). San Diego, CA, USA: IEEE: icppw.
Eitzinger, J., Hager, G., & Wellein, G. (2010). Complexities of Performance Prediction for Bandwidth-Limited Loop Kernels on Multi-Core Architectures. In High Performance Computing in Science and Engineering, Garching/Munich 2009. Leibniz Supercomputing Centre, Garching/Munich, Germany, DE: Berlin Heidelberg: Springer-Verlag.
Wittmann, M., Hager, G., Eitzinger, J., & Wellein, G. (2010). Leveraging shared caches for parallel temporal blocking of stencil codes on multicore processors and clusters. Parallel Processing Letters, 20(4), 359-376. https://dx.doi.org/10.1142/S0129626410000296
Eitzinger, J., Hager, G., Wellein, G., & Meier, M. (2010). LIKWID performance tools. (pp. 50-53).
Feichtinger, C., Habich, J., Köstler, H., Hager, G., Rüde, U., & Wellein, G. (2010). A Flexible Patch-Based Lattice Boltzmann Parallelization Approach for Heterogeneous GPU-CPU Clusters.
Zeiser, T., Hager, G., & Wellein, G. (2009). The world's fastest CPU and SMP node: Some performance results from the NEC SX-9. In Proceedings of the IEEE International Symposium on Parallel & Distributed Processing 2009 (pp. 1-8). Roma: IEEE Computer Society: ipdps.
Habich, J., Zeiser, T., Hager, G., & Wellein, G. (2009). Speeding up a Lattice Boltzmann Kernel on nVIDIA GPUs. In Proceedings of the First International Conference on Parallel, Distributed and Grid Computing for Engineering (pp. 17). Pécs, Hungary, HU: Kippen, Stirlingshire, United Kingdom: Civil-Comp Press.
Stürmer, M., Wellein, G., Hager, G., Köstler, H., & Rüde, U. (2009). Challenges and Potentials of Emerging Multicore Architectures. In High Performance Computing in Science and Engineering Garching-Munich 2007 (pp. 551-566). Garching: Berlin Heidelberg: Springer.
Zeiser, T., Hager, G., & Wellein, G. (2009). Vector Computers in a World of Commodity Clusters, Massively Parallel Systems and Many-Core Many-Threaded CPUs: Recent Experience Based on an Advanced Lattice Boltzmann Flow Solver. In High Performance Computing in Science and Engineering '08: Transactions of the High Performance Computing Center, Stuttgart (HLRS) 2008. (pp. 333-347). Berlin Heidelberg: Springer.
Wellein, G., Hager, G., Zeiser, T., Wittmann, M., & Fehske, H. (2009). Efficient temporal blocking for stencil computations by multicore-aware wavefront parallelization. In Proceedings of 2009 33rd Annual IEEE International Computer Software and Applications Conference (pp. 579-586). Seattle, USA: IEEE Computer Society: IPSJ/IEEE SAINT Conference, DOI 10.1109/COMPSAC.2009.82.
Hager, G., Stengel, H., Zeiser, T., & Wellein, G. (2009). RZBENCH: performance evaluation of current HPC architectures using low-level and application benchmarks. In High Performance Computing in Science and Engineering, Garching/Munich 2007: Transactions of the Third Joint HLRB and KONWIHR Status and Result Workshop, Dec. 3-4, 2007, Leibniz Supercomputing Centre, Garching/Munich, Germany. (pp. 485-501). Berlin, Heidelberg: Springer.
Zeiser, T., Hager, G., & Wellein, G. (2009). Benchmark analysis and application results for lattice Boltzmann simulations on NEC SX vector and Intel Nehalem systems. Parallel Processing Letters, 19(4), 491-511. https://dx.doi.org/10.1142/S0129626409000389
Hager, G., Zeiser, T., & Wellein, G. (2008). Data access characteristics and optimizations for Sun UltraSPARC T2 and T2+ systems. Parallel Processing Letters, 18(4), 471-490. https://dx.doi.org/10.1142/S0129626408003521
Zeiser, T., Wellein, G., Iglberger, K., Rüde, U., Hager, G., & Nitsure, A. (2008). Introducing a parallel cache oblivious blocking approach for the lattice Boltzmann method. Progress in Computational Fluid Dynamics, 8(1-4), 179-188. https://dx.doi.org/10.1504/PCFD.2008.018088
Donath, S., Iglberger, K., Wellein, G., Zeiser, T., Nitsure, A., & Rüde, U. (2008). Performance comparison of different parallel lattice Boltzmann implementations on multi-core multi-socket systems. International Journal of Computational Science and Engineering, 4(1), 3-11. https://dx.doi.org/10.1504/IJCSE.2008.021107
Hager, G., Zeiser, T., & Wellein, G. (2008). Data access optimizations for highly threaded multi-core CPUs with multiple memory controllers. In Proceedings of the 2008 IEEE International Parallel & Distributed Processing Symposium (pp. 1-7). Miami, FL, USA: IEEE Catalog Number: CFP08023-CDR.
Breuer, M., Zeiser, T., Hager, G., Wellein, G., & Lammers, P. (2008). Direct numerical simulation of turbulent flow over dimples - Code optimization for NEC SX-8 plus flow results. In High Performance Computing in Science and Engineering '07: Transactions of the High Performance Computing Center, Stuttgart (HLRS) 2007. (pp. 303-318). Berlin/Heidelberg: Springer.
Wellein, G., Hager, G., & Rüde, U. (2008). What's next? Evaluating Performance and Programming Approaches for Emerging Computer Technologies.
Bergen, B., Wellein, G., Hülsemann, F., & Rüde, U. (2007). Hierarchical hybrid grids: achieving TERAFLOP performance on large scale finite element simulations. International Journal of Parallel, Emergent and Distributed Systems, 22(4), 311-329. https://dx.doi.org/10.1080/17445760701442218


Additional publications


Shahzad, F., Thies, J., Kreutzer, M., Zeiser, T., Hager, G., & Wellein, G. (2018). CRAFT: A library for easier application-level Checkpoint/Restart and Automatic Fault Tolerance. IEEE Transactions on Parallel and Distributed Systems. https://dx.doi.org/10.1109/TPDS.2018.2866794

Last updated on 24.04.2019 at 10:15