Skin cancer classification via convolutional neural networks: systematic review of studies involving human experts

Haggenmüller S, Maron RC, Hekler A, Utikal JS, Barata C, Barnhill RL, Beltraminelli H, Berking C, Betz-Stablein B, Blum A, Braun SA, Carr R, Combalia M, Fernandez-Figueras MT, Ferrara G, Fraitag S, French LE, Gellrich FF, Ghoreschi K, Goebeler M, Guitera P, Haenssle HA, Haferkamp S, Heinzerling L, Heppt M, Hilke FJ, Hobelsberger S, Krahl D, Kutzner H, Lallas A, Liopyris K, Llamas-Velasco M, Malvehy J, Meier F, Müller CS, Navarini AA, Navarrete-Dechent C, Perasole A, Poch G, Podlipnik S, Requena L, Rotemberg VM, Saggini A, Sangueza OP, Santonja C, Schadendorf D, Schilling B, Schlaak M, Schlager JG, Sergon M, Sondermann W, Soyer HP, Starz H, Stolz W, Vale E, Weyers W, Zink A, Krieghoff-Henning E, Kather JN, von Kalle C, Lipka DB, Fröhling S, Hauschild A, Kittler H, Brinker TJ (2021)

Publication Type: Journal article

Publication year: 2021

Journal

European Journal of Cancer (Elsevier)

Volume: 156

Page Range: 202-216

DOI: 10.1016/j.ejca.2021.06.049

Abstract

Background: Multiple studies have compared the performance of artificial intelligence (AI)-based models for automated skin cancer classification to human experts, thus setting the cornerstone for a successful translation of AI-based tools into clinicopathological practice.

Objective: The objective of the study was to systematically analyse the current state of research on reader studies involving melanoma and to assess their potential clinical relevance by evaluating three main aspects: test set characteristics (holdout/out-of-distribution data set, composition), test setting (experimental/clinical, inclusion of metadata) and representativeness of participating clinicians.

Methods: PubMed, Medline and ScienceDirect were screened for peer-reviewed studies published between 2017 and 2021 and dealing with AI-based skin cancer classification involving melanoma. The search terms skin cancer classification, deep learning, convolutional neural network (CNN), melanoma (detection), digital biomarkers, histopathology and whole slide imaging were combined. Based on the search results, only studies that considered direct comparison of AI results with clinicians and had a diagnostic classification as their main objective were included.

Results: A total of 19 reader studies fulfilled the inclusion criteria. Of these, 11 CNN-based approaches addressed the classification of dermoscopic images; 6 concentrated on the classification of clinical images, whereas 2 dermatopathological studies utilised digitised histopathological whole slide images.

Conclusions: All 19 included studies demonstrated superior or at least equivalent performance of CNN-based classifiers compared with clinicians. However, almost all studies were conducted in highly artificial settings based exclusively on single images of the suspicious lesions. Moreover, test sets mainly consisted of holdout images and did not represent the full range of patient populations and melanoma subtypes encountered in clinical practice.

Authors with CRIS profile

Carola Berking, Lehrstuhl für Haut- und Geschlechtskrankheiten

Markus Heppt, Department of Dermatology

Involved external institutions

Deutsches Krebsforschungszentrum (DKFZ), Germany (DE)

Royal Prince Alfred Hospital, Australia (AU)

Heinrich-Heine-Universität Düsseldorf, Germany (DE)

Institut Curie, France (FR)

Berliner Institut für Gesundheitsforschung in der Charité / Berlin Institute of Health at Charité (BIH), Germany (DE)

Hospital Universitario Fundación Jiménez Díaz, Spain (ES)

Ospedale San Bortolo, Italy (IT)

Charité - Universitätsmedizin Berlin, Germany (DE)

Universitätsklinikum Würzburg, Germany (DE)

Memorial Sloan Kettering Cancer Center, United States (US)

Klinikum der Universität München (LMU Klinikum), Germany (DE)

Nationales Centrum für Tumorerkrankungen (NCT), Germany (DE)

Universitätsklinikum Aachen (UKA), Germany (DE)

Bellvitge University Hospital / Hospital Universitari de Bellvitge, Spain (ES)

Ruprecht-Karls-Universität Heidelberg, Germany (DE)

Hospital da Luz, Portugal (PT)

Universitätsklinikum Regensburg, Germany (DE)

Hospital Universitario de La Princesa, Spain (ES)

Hospital of Macerata / Ospedale di Macerata, Italy (IT)

Universitat de Barcelona (UB) / University of Barcelona, Spain (ES)

Universitätsklinikum Schleswig-Holstein (UKSH), Germany (DE)

Wake Forest University, United States (US)

Universitätsspital Basel, Switzerland (CH)

Universitätsklinikum Heidelberg, Germany (DE)

Medizinische Universität Wien, Austria (AT)

University of Queensland, Australia (AU)

Université de Paris, France (FR)

Institut für DermatoHistoPathologie Dres. Krahl & Partner, Germany (DE)

Universitätsklinikum Essen, Germany (DE)

Instituto Superior Técnico, Portugal (PT)

Warwick Hospital, United Kingdom (GB)

Inselspital, Universitätsspital Bern, Switzerland (CH)

Technische Universität München (TUM), Germany (DE)

München Klinik gGmbH, Germany (DE)

How to cite

APA:

Haggenmüller, S., Maron, R.C., Hekler, A., Utikal, J.S., Barata, C., Barnhill, R.L.,... Brinker, T.J. (2021). Skin cancer classification via convolutional neural networks: systematic review of studies involving human experts. European Journal of Cancer, 156, 202-216. https://doi.org/10.1016/j.ejca.2021.06.049

MLA:

Haggenmüller, Sarah, et al. "Skin cancer classification via convolutional neural networks: systematic review of studies involving human experts." European Journal of Cancer 156 (2021): 202-216.
