Algorithmic transparency and interpretability measures improve radiologists’ performance in BI-RADS 4 classification

Jungmann F, Ziegelmayer S, Lohoefer FK, Metz S, Mueller-Leisse C, Englmaier M, Makowski MR, Kaissis GA, Braren RF (2023)


Publication Type: Journal article

Publication year: 2023

Journal: European Radiology

Volume: 33

Pages Range: 1844-1851

Journal Issue: 3

DOI: 10.1007/s00330-022-09165-9

Abstract

Objective: To evaluate the perception of different types of AI-based assistance and the interaction of radiologists with the algorithm’s predictions and certainty measures.

Methods: In this retrospective observer study, four radiologists were asked to classify Breast Imaging-Reporting and Data System 4 (BI-RADS 4) lesions (n = 101 benign, n = 99 malignant). The effect of different types of AI-based assistance (occlusion-based interpretability map, classification, and certainty) on the radiologists’ performance (sensitivity, specificity, questionnaire) was measured. The influence of the Big Five personality traits was analyzed using the Pearson correlation.

Results: Diagnostic accuracy was significantly improved by AI-based assistance (an increase of 2.8% ± 2.3%, 95% CI 1.5% to 4.0%, p = 0.045), and trust in the algorithm was generated primarily by the certainty of the prediction (100% of participants). Different human-AI interactions were observed, ranging from nearly no interaction to humanization of the algorithm. High neuroticism scores correlated with higher persuasibility (Pearson’s r = 0.98, p = 0.02), while higher conscientiousness correlated inversely with change in accuracy (Pearson’s r = −0.96, p = 0.04).

Conclusion: Trust in the algorithm’s performance was mostly dependent on the certainty of the predictions in combination with a plausible heatmap. Human-AI interaction varied widely and was influenced by personality traits.

Key Points:
• AI-based assistance significantly improved the diagnostic accuracy of radiologists in classifying BI-RADS 4 mammography lesions.
• Trust in the algorithm’s performance was mostly dependent on the certainty of the prediction in combination with a reasonable heatmap.
• Personality traits seem to influence human-AI collaboration. Radiologists with specific personality traits were more likely to change their classification according to the algorithm’s prediction than others.
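
The abstract names sensitivity, specificity, and the Pearson correlation as its measures but does not spell out how they are computed. The following is a minimal Python sketch of the standard textbook definitions, not the authors' analysis code. The function names `sensitivity_specificity` and `pearson_r` and all input values are hypothetical illustrations and do not come from the study.

```python
from math import sqrt


def sensitivity_specificity(truth, predicted):
    """Return (sensitivity, specificity) for binary labels.

    Labels are assumed to be 1 = malignant, 0 = benign (an assumption
    made for this sketch).
    """
    pairs = list(zip(truth, predicted))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # malignant, called malignant
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # malignant, called benign
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # benign, called benign
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # benign, called malignant
    return tp / (tp + fn), tn / (tn + fp)


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (spread_x * spread_y)


# Toy inputs, invented for illustration only (not study data).
toy_truth = [1, 1, 1, 1, 0, 0, 0, 0]
toy_reads = [1, 1, 1, 0, 0, 0, 1, 1]
sens, spec = sensitivity_specificity(toy_truth, toy_reads)
print(f"sensitivity={sens:.2f}, specificity={spec:.2f}")

toy_trait = [1.0, 2.0, 3.0, 4.0]
toy_changes = [2.0, 4.0, 6.0, 8.0]
print(f"r={pearson_r(toy_trait, toy_changes):.2f}")
```

With only four readers, as in this study, a correlation is computed from four paired values, so the coefficient is sensitive to any single reader.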

How to cite

APA:

Jungmann, F., Ziegelmayer, S., Lohoefer, F.K., Metz, S., Mueller-Leisse, C., Englmaier, M.,... Braren, R.F. (2023). Algorithmic transparency and interpretability measures improve radiologists’ performance in BI-RADS 4 classification. European Radiology, 33(3), 1844-1851. https://doi.org/10.1007/s00330-022-09165-9

MLA:

Jungmann, Friederike, et al. "Algorithmic transparency and interpretability measures improve radiologists’ performance in BI-RADS 4 classification." European Radiology 33.3 (2023): 1844-1851.
