LLM-driven Baselines for Medical Image Segmentation: A Systematic Analysis

Arjomandi J, Neubig L, Kist A (2025)


Publication Language: English

Publication Type: Conference contribution

Publication year: 2025

Publisher: Springer

Series: Informatik aktuell

City/Town: Cham

Pages Range: 50-56

Conference Proceedings Title: Bildverarbeitung für die Medizin 2025. Proceedings, German Conference on Medical Image Computing, Regensburg March 09-11, 2025

Event location: Regensburg DE

ISBN: 9783658474218

DOI: 10.1007/978-3-658-47422-5_13

Abstract

Large Language Models (LLMs) are increasingly utilized in tasks such as code generation for medical image analysis. However, their specific influence on study outcomes remains under-explored. In this study, we provide a comprehensive evaluation of various open- and closed-source LLMs, comparing their performance across medical imaging datasets. Each LLM was tasked with generating code for a U-Net-based baseline for a semantic segmentation task, guided by a tailored prompt. We evaluated each LLM’s generated model performance using the Dice coefficient and recorded all interactions with the LLM. Significant variations in baseline performance were observed among the LLMs, with differences of up to Δ 85.49% for the Bolus, 86.33% for the BAGLS, and 87.32% for the Brain Tumor test dataset. Additionally, we identified LLMs with minimal coding errors (best-performing LLMs: GPT o1 Preview and Claude 3.5 Sonnet with zero errors upon initial code execution; worst-performing: Gemini 1.5 Pro and Llama 3.1 405B with 15 and 11 errors, respectively). In summary, careful selection of LLMs can significantly enhance medical image analysis code generation and establish reliable baselines for further algorithmic development.
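
For readers unfamiliar with the metric named in the abstract, the following is a minimal sketch of the Dice coefficient on flattened binary masks, together with a helper that computes the spread between the best and worst score. It is not the authors' evaluation code: the function names `dice_coefficient` and `performance_gap`, the convention of returning 1.0 for two empty masks, and the model names in the usage note are assumptions made for illustration only.

```python
from typing import Mapping, Sequence


def dice_coefficient(pred: Sequence[int], target: Sequence[int]) -> float:
    """Dice = 2 * |A ∩ B| / (|A| + |B|) for two flattened binary masks."""
    if len(pred) != len(target):
        raise ValueError("masks must contain the same number of pixels")
    # Pixels that are foreground in both masks.
    intersection = sum(1 for p, t in zip(pred, target) if p and t)
    # Total foreground pixels across both masks.
    total = sum(1 for p in pred if p) + sum(1 for t in target if t)
    if total == 0:
        # Assumed convention: two empty masks count as a perfect match.
        return 1.0
    return 2.0 * intersection / total


def performance_gap(scores: Mapping[str, float]) -> float:
    """Difference between the highest and lowest score in a results mapping."""
    return max(scores.values()) - min(scores.values())
```

As a usage illustration with made-up values, `dice_coefficient([1, 1, 0, 0], [1, 0, 0, 0])` evaluates 2 * 1 / (2 + 1), and `performance_gap({"model_a": 0.9, "model_b": 0.1})` gives the kind of best-versus-worst difference the abstract reports as Δ.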

How to cite

APA:

Arjomandi, J., Neubig, L., & Kist, A. (2025). LLM-driven Baselines for Medical Image Segmentation: A Systematic Analysis. In Christoph Palm, Katharina Breininger, Thomas Deserno, Heinz Handels, Andreas Maier, Klaus H. Maier-Hein, & Thomas M. Tolxdorff (Eds.), Bildverarbeitung für die Medizin 2025. Proceedings, German Conference on Medical Image Computing, Regensburg March 09-11, 2025 (pp. 50-56). Cham: Springer.

MLA:

Arjomandi, Jasmin, Luisa Neubig, and Andreas Kist. "LLM-driven Baselines for Medical Image Segmentation: A Systematic Analysis." Proceedings of the German Conference on Medical Image Computing, 2025, Regensburg. Ed. Christoph Palm, Katharina Breininger, Thomas Deserno, Heinz Handels, Andreas Maier, Klaus H. Maier-Hein, and Thomas M. Tolxdorff. Cham: Springer, 2025. 50-56.
