Nützel F, Dombrowski MN, Kainz B (2026)
Publication Type: Conference contribution
Publication year: 2026
Publisher: Springer Science and Business Media Deutschland GmbH
Book Volume: 16241 LNCS
Pages Range: 540-550
Conference Proceedings Title: Lecture Notes in Computer Science
Event location: Daejeon, KOR
ISBN: 9783032095121
DOI: 10.1007/978-3-032-09513-8_52
Retrieval-augmented learning based on radiology reports has emerged as a promising direction to improve performance on long-tail medical imaging tasks, such as rare disease detection in chest X-rays. Most existing methods rely on comparing high-dimensional text embeddings from models like CLIP or CXR-BERT, which are often difficult to interpret, computationally expensive, and not well-aligned with the structured nature of medical knowledge. We propose a novel, ontology-driven alternative for comparing radiology report texts based on clinically grounded concepts from the Unified Medical Language System (UMLS). Our method extracts standardised medical entities from free-text reports using an enhanced pipeline built on RadGraph-XL and SapBERT. These entities are linked to UMLS concepts (CUIs), enabling a transparent, interpretable set-based representation of each report. We then define a task-adaptive similarity measure based on a modified and weighted version of the Tversky Index that accounts for synonymy, negation, and hierarchical relationships between medical entities. This allows efficient and semantically meaningful similarity comparisons between reports. We demonstrate that our approach outperforms state-of-the-art embedding-based retrieval methods in a radiograph classification task on MIMIC-CXR, particularly in long-tail settings. Additionally, we use our pipeline to generate ontology-backed disease labels for MIMIC-CXR, offering a valuable new resource for downstream learning tasks. Our work provides more explainable, reliable, and task-specific retrieval strategies in clinical AI systems, especially when interpretability and domain knowledge integration are essential. Our code is available at https://github.com/Felix-012/ontology-concept-distillation.git.
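As a rough illustration of the set-based comparison described in the abstract, the following is a minimal sketch of the textbook Tversky index over two sets of concept identifiers, with optional per-concept weights. It is not the authors' modified measure: the handling of synonymy, negation, and hierarchy is not reproduced here, and the function name `weighted_tversky`, its parameters, and the placeholder identifiers are assumptions made for this example only.

```python
from typing import Dict, Optional, Set


def weighted_tversky(
    a: Set[str],
    b: Set[str],
    alpha: float = 0.5,
    beta: float = 0.5,
    weights: Optional[Dict[str, float]] = None,
) -> float:
    """Tversky index between two concept sets.

    S(A, B) = w(A & B) / (w(A & B) + alpha * w(A - B) + beta * w(B - A))

    where w(.) sums per-concept weights (default weight 1.0 per concept).
    Hypothetical helper for illustration; not the paper's implementation.
    """

    def mass(items: Set[str]) -> float:
        # Sum the weights of the given concepts; unlisted concepts weigh 1.0.
        if weights is None:
            return float(len(items))
        return sum(weights.get(item, 1.0) for item in items)

    common = mass(a & b)   # concepts present in both reports
    only_a = mass(a - b)   # concepts only in the first report
    only_b = mass(b - a)   # concepts only in the second report

    denominator = common + alpha * only_a + beta * only_b
    if denominator == 0:
        # Both sets are empty: define the similarity as 0.0 by convention.
        return 0.0
    return common / denominator


# Placeholder identifiers standing in for UMLS CUIs (not real concept codes).
report_a = {"CUI_1", "CUI_2", "CUI_3"}
report_b = {"CUI_2", "CUI_3", "CUI_4"}

similarity = weighted_tversky(report_a, report_b)
print(round(similarity, 3))
```

With alpha = beta = 1 the measure reduces to the Jaccard index, and with alpha = beta = 0.5 to the Dice coefficient; choosing unequal values makes the comparison asymmetric, which is one way a similarity measure can be adapted to a task.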
APA:
Nützel, F., Dombrowski, M.N., & Kainz, B. (2026). Ontology-Based Concept Distillation for Radiology Report Retrieval and Labeling. In Zhiming Cui, Islem Rekik, Heung-IL Suk, Xi Ouyang, Kaicong Sun, Sheng Wang (Eds.), Lecture Notes in Computer Science (pp. 540-550). Daejeon, KOR: Springer Science and Business Media Deutschland GmbH.
MLA:
Nützel, Felix, Mischa Neil Dombrowski, and Bernhard Kainz. "Ontology-Based Concept Distillation for Radiology Report Retrieval and Labeling." Proceedings of the 16th International Workshop on Machine Learning in Medical Imaging, MLMI 2025, held in conjunction with the 28th International Conference on Medical Image Computing and Computer Assisted Intervention, MICCAI 2025, Daejeon, KOR. Ed. Zhiming Cui, Islem Rekik, Heung-IL Suk, Xi Ouyang, Kaicong Sun, Sheng Wang. Springer Science and Business Media Deutschland GmbH, 2026. 540-550.