Algorithmic transparency and interpretability measures improve radiologists’ performance in BI-RADS 4 classification

Jungmann F, Ziegelmayer S, Lohoefer FK, Metz S, Mueller-Leisse C, Englmaier M, Makowski MR, Kaissis GA, Braren RF (2023)

Publication Type: Journal article

Publication year: 2023

Journal

European Radiology, Springer Verlag (Germany)

Volume: 33

Page Range: 1844-1851

Journal Issue: 3

DOI: 10.1007/s00330-022-09165-9

Abstract

Objective: To evaluate the perception of different types of AI-based assistance and the interaction of radiologists with the algorithm’s predictions and certainty measures.

Methods: In this retrospective observer study, four radiologists were asked to classify Breast Imaging-Reporting and Data System 4 (BI-RADS 4) lesions (n = 101 benign, n = 99 malignant). The effect of different types of AI-based assistance (occlusion-based interpretability map, classification, and certainty) on the radiologists’ performance (sensitivity, specificity, questionnaire) was measured. The influence of the Big Five personality traits was analyzed using the Pearson correlation.

Results: Diagnostic accuracy was significantly improved by AI-based assistance (an increase of 2.8% ± 2.3%, 95% CI 1.5 to 4.0%, p = 0.045), and trust in the algorithm was generated primarily by the certainty of the prediction (100% of participants). Different human-AI interactions were observed, ranging from nearly no interaction to humanization of the algorithm. High scores in neuroticism were correlated with higher persuasibility (Pearson’s r = 0.98, p = 0.02), while higher conscientiousness and change in accuracy showed an inverse correlation (Pearson’s r = −0.96, p = 0.04).

Conclusion: Trust in the algorithm’s performance was mostly dependent on the certainty of the predictions in combination with a plausible heatmap. Human-AI interaction varied widely and was influenced by personality traits.

Key Points:
• AI-based assistance significantly improved the diagnostic accuracy of radiologists in classifying BI-RADS 4 mammography lesions.
• Trust in the algorithm’s performance was mostly dependent on the certainty of the prediction in combination with a reasonable heatmap.
• Personality traits seem to influence human-AI collaboration. Radiologists with specific personality traits were more likely to change their classification according to the algorithm’s prediction than others.

Involved external institutions

Technische Universität München (TUM)

Germany (DE)

How to cite

APA:

Jungmann, F., Ziegelmayer, S., Lohoefer, F.K., Metz, S., Mueller-Leisse, C., Englmaier, M., ... Braren, R.F. (2023). Algorithmic transparency and interpretability measures improve radiologists’ performance in BI-RADS 4 classification. European Radiology, 33(3), 1844-1851. https://doi.org/10.1007/s00330-022-09165-9

MLA:

Jungmann, Friederike, et al. "Algorithmic transparency and interpretability measures improve radiologists’ performance in BI-RADS 4 classification." European Radiology 33.3 (2023): 1844-1851.
