Bhat S, Georgescu B, Mansoor A, Ghesu FC, Grbic S, Maier A (2025)
Publication Language: English
Publication Type: Conference contribution
Publication year: 2025
Publisher: Springer
Conference Proceedings Title: 10th International Conference on Computer Vision & Image Processing
Deep Learning (DL)-based Computer-Aided Diagnostic (CAD) systems have
shown great promise in supporting clinicians, but their performance
depends heavily on the availability of large, well-annotated datasets.
In medical lesion detection, the scarcity of comprehensive, standardized
annotations and the heterogeneity of existing datasets limit the
development of robust foundation models. In this work, we introduce a
novel approach to building a multi-domain foundation model for lesion
detection by integrating diverse medical image datasets annotated for
both anatomical structures and abnormalities. We first propose a
student-teacher training strategy that effectively combines datasets with
heterogeneous label spaces, mitigating catastrophic forgetting and
improving feature learning. In addition, we adapt our previously proposed
example-based EM-DETR framework to learn jointly from multiple
domain-specific datasets, enabling fine-grained, example-driven detection
across modalities. Our model achieves state-of-the-art (SOTA) results,
with image-level AUC scores exceeding the previous SOTA by 1 percentage
point on four key findings in chest X-rays (CXRs). We also achieve a
leading mAP50 in CXRs and mammography, surpassing EM-DETR trained on a
single dataset by 2-3 percentage points. This approach paves the way for
scalable foundation models in medical imaging that leverage heterogeneous
data while maintaining robust and generalizable performance.
APA:
Bhat, S., Georgescu, B., Mansoor, A., Ghesu, F.-C., Grbic, S., & Maier, A. (2025). Toward Foundation Detection Models via Example-Based Grounding in EM-DETR. In 10th International Conference on Computer Vision & Image Processing. Punjab, India: Springer.
MLA:
Bhat, Sheethal, et al. "Toward Foundation Detection Models via Example-Based Grounding in EM-DETR." Proceedings of the 10th International Conference on Computer Vision & Image Processing, Punjab, India, Springer, 2025.