Understanding metric-related pitfalls in image analysis validation

Reinke A, Tizabi MD, Baumgartner M, Eisenmann M, Heckmann-Nötzel D, Kavur AE, Rädsch T, Sudre CH, Acion L, Antonelli M, Arbel T, Bakas S, Benis A, Buettner F, Cardoso MJ, Cheplygina V, Chen J, Christodoulou E, Cimini BA, Farahani K, Ferrer L, Galdran A, van Ginneken B, Glocker B, Godau P, Hashimoto DA, Hoffman MM, Huisman M, Isensee F, Jannin P, Kahn CE, Kainmueller D, Kainz B, Karargyris A, Kleesiek J, Kofler F, Kooi T, Kopp-Schneider A, Kozubek M, Kreshuk A, Kurc T, Landman BA, Litjens G, Madani A, Maier-Hein K, Martel AL, Meijering E, Menze B, Moons KG, Müller H, Nichyporuk B, Nickel F, Petersen J, Rafelski SM, Rajpoot N, Reyes M, Riegler MA, Rieke N, Saez-Rodriguez J, Sánchez CI, Shetty S, Summers RM, Taha AA, Tiulpin A, Tsaftaris SA, Van Calster B, Varoquaux G, Yaniv ZR, Jäger PF, Maier-Hein L (2024)


Publication Type: Journal article

Publication year: 2024

Journal: Nature Methods

Volume: 21

Page Range: 182-194

Journal Issue: 2

DOI: 10.1038/s41592-023-02150-0

Abstract

Validation metrics are key for tracking scientific progress and bridging the current chasm between artificial intelligence research and its translation into practice. However, increasing evidence shows that, particularly in image analysis, metrics are often chosen inadequately. Although taking into account the individual strengths, weaknesses and limitations of validation metrics is a critical prerequisite to making educated choices, the relevant knowledge is currently scattered and poorly accessible to individual researchers. Based on a multistage Delphi process conducted by a multidisciplinary expert consortium as well as extensive community feedback, the present work provides a reliable and comprehensive common point of access to information on pitfalls related to validation metrics in image analysis. Although focused on biomedical image analysis, the addressed pitfalls generalize across application domains and are categorized according to a newly created, domain-agnostic taxonomy. The work serves to enhance global comprehension of a key topic in image analysis validation.

Involved external institutions

Oulun Yliopisto / University of Oulu, Finland (FI)
National Institutes of Health Clinical Center, United States (US)
Technische Universität Wien / Vienna University of Technology, Austria (AT)
Katholieke Universiteit Leuven (KUL) / Catholic University of Leuven, Belgium (BE)
Inria Saclay - Île-de-France Research Centre, France (FR)
Deutsches Krebsforschungszentrum (DKFZ), Germany (DE)
Université de Rennes 1 / University of Rennes 1, France (FR)
Penn Medicine, United States (US)
Allen Institute for Cell Science, United States (US)
University of Warwick, United Kingdom (GB)
Universität Bern, Switzerland (CH)
Simula Metropolitan Center for Digital Engineering (SimulaMet), Norway (NO)
Nvidia Corporation, United States (US)
Ruprecht-Karls-Universität Heidelberg, Germany (DE)
University of Amsterdam, Netherlands (NL)
King’s College London, United Kingdom (GB)
McGill University, Canada (CA)
Institut de Chirurgie Guidée par l’Image de Strasbourg (IHU) / Institute of Image-Guided Surgery, France (FR)
Universitätsklinikum Essen, Germany (DE)
Google Ireland Limited, Ireland (IE)
University of Edinburgh, United Kingdom (GB)
National Institute of Allergy and Infectious Diseases, United States (US)
University College London (UCL), United Kingdom (GB)
Universidad de Buenos Aires (UBA) / University of Buenos Aires, Argentina (AR)
Indiana University – Purdue University Indianapolis, United States (US)
Holon Institute of Technology, Israel (IL)
IT University of Copenhagen, Denmark (DK)
Leibniz-Institut für Analytische Wissenschaften / Leibniz Institute for Analytical Sciences (ISAS), Germany (DE)
Eli and Edythe L. Broad Institute of MIT and Harvard, United States (US)
National Cancer Institute (NCI), United States (US)
Max-Delbrück-Centrum für Molekulare Medizin / Max Delbrück Center for Molecular Medicine (MDC) Berlin-Buch, Germany (DE)
Imperial College London / The Imperial College of Science, Technology and Medicine, United Kingdom (GB)
Quebec AI Institute / Quebec Artificial Intelligence Institute / Montreal Institute for Learning Algorithms (MILA), Canada (CA)
Universitätsklinikum Hamburg-Eppendorf (UKE), Germany (DE)
Universitat Pompeu Fabra (UPF), Spain (ES)
Helmholtz-Gemeinschaft / Hermann von Helmholtz-Gemeinschaft Deutscher Forschungszentren e.V., Germany (DE)
Lunit Inc., Republic of Korea (KR)
Masaryk University, Czech Republic (CZ)
Fraunhofer-Institut für Bildgestützte Medizin (MEVIS), Germany (DE)
European Molecular Biology Laboratory (EMBL), Germany (DE)
State University of New York at Albany (SUNY Albany / UAlbany), United States (US)
Perelman School of Medicine, University of Pennsylvania, United States (US)
Princess Margaret Cancer Centre / Princess Margaret Hospital, Canada (CA)
Radboud University Nijmegen Medical Centre / Radboudumc, in full Radboud Universitair Medisch Centrum (UMC), Netherlands (NL)
Vanderbilt University, United States (US)
St. Luke's University Health Network (SLUHN), United States (US)
University of Toronto, Canada (CA)
University of New South Wales (UNSW), Australia (AU)
University of Zurich / Universität Zürich (UZH), Switzerland (CH)
University Medical Centre Utrecht (UMC Utrecht), Netherlands (NL)
Haute école spécialisée de Suisse occidentale (HES-SO) / Fachhochschule Westschweiz, Switzerland (CH)

How to cite

APA:

Reinke, A., Tizabi, M.D., Baumgartner, M., Eisenmann, M., Heckmann-Nötzel, D., Kavur, A.E., ... Maier-Hein, L. (2024). Understanding metric-related pitfalls in image analysis validation. Nature Methods, 21(2), 182-194. https://doi.org/10.1038/s41592-023-02150-0

MLA:

Reinke, Annika, et al. "Understanding metric-related pitfalls in image analysis validation." Nature Methods 21.2 (2024): 182-194.
