HiFormer: Hierarchical Multi-scale Representations Using Transformers for Medical Image Segmentation

Heidari M, Kazerouni A, Soltany M, Azad R, Aghdam EK, Cohen-Adad J, Merhof D (2023)


Publication Type: Conference contribution

Publication year: 2023

Publisher: Institute of Electrical and Electronics Engineers Inc.

Pages Range: 6191-6201

Conference Proceedings Title: Proceedings - 2023 IEEE Winter Conference on Applications of Computer Vision, WACV 2023

Event location: Waikoloa, HI, USA

ISBN: 9781665493468

DOI: 10.1109/WACV56688.2023.00614

Abstract

Convolutional neural networks (CNNs) have been the consensus for medical image segmentation tasks. However, they suffer from the limitation in modeling long-range dependencies and spatial correlations due to the nature of convolution operation. Although transformers were first developed to address this issue, they fail to capture low-level features. In contrast, it is demonstrated that both local and global features are crucial for dense prediction, such as segmenting in challenging contexts. In this paper, we propose HiFormer, a novel method that efficiently bridges a CNN and a transformer for medical image segmentation. Specifically, we design two multi-scale feature representations using the seminal Swin Transformer module and a CNN-based encoder. To secure a fine fusion of global and local features obtained from the two aforementioned representations, we propose a Double-Level Fusion (DLF) module in the skip connection of the encoder-decoder structure. Extensive experiments on various medical image segmentation datasets demonstrate the effectiveness of HiFormer over other CNN-based, transformer-based, and hybrid methods in terms of computational complexity, quantitative and qualitative results. Our code is publicly available at GitHub.

How to cite

APA:

Heidari, M., Kazerouni, A., Soltany, M., Azad, R., Aghdam, E.K., Cohen-Adad, J., & Merhof, D. (2023). HiFormer: Hierarchical Multi-scale Representations Using Transformers for Medical Image Segmentation. In Proceedings - 2023 IEEE Winter Conference on Applications of Computer Vision, WACV 2023 (pp. 6191-6201). Waikoloa, HI, USA: Institute of Electrical and Electronics Engineers Inc.

MLA:

Heidari, Moein, et al. "HiFormer: Hierarchical Multi-scale Representations Using Transformers for Medical Image Segmentation." Proceedings of the 23rd IEEE/CVF Winter Conference on Applications of Computer Vision, WACV 2023, Waikoloa, HI, USA, Institute of Electrical and Electronics Engineers Inc., 2023. 6191-6201.
