Frequency–spatial decoupled co-modeling transformer for fine-grained remote sensing image segmentation

Li X, Qian S, Lyu X, Song Y, Liu F, Fang Y, Xu Z, Kaup A (2026)


Publication Type: Journal article

Publication year: 2026

Journal: Information Sciences

Book Volume: 747

Article Number: 123489

DOI: 10.1016/j.ins.2026.123489

Abstract

Fine-grained semantic segmentation of remote sensing images (RSIs) remains a long-standing challenge due to complex scene structures, diverse object scales, and blurred boundaries that hinder the precise delineation of land-cover classes. Existing Transformer-based approaches mainly focus on spatial attention while overlooking the complementary frequency-domain cues essential for capturing subtle textural and boundary details. To address this issue, we propose FDFormer, a frequency–spatial co-modeling transformer that incorporates frequency-domain representation learning into the segmentation pipeline. Specifically, the proposed frequency decoupling attention (FDA) adaptively separates features into high- and low-frequency bands to enhance structural perception and suppress redundant components. Moreover, a second-order cross-attention mechanism fuses the spatial stream with the frequency stream to strengthen global-local interactions and produce more discriminative context-aware representations. In addition, a hybrid boundary-aware loss combines pixel-wise cross-entropy with frequency-guided boundary supervision to balance region consistency and edge precision. Experiments conducted on two benchmark datasets, ISPRS Potsdam and LoveDA, demonstrate that FDFormer consistently surpasses state-of-the-art methods. Qualitative results further verify its effectiveness in preserving thin structures and intricate object contours.
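
To make the idea of separating features into high- and low-frequency bands concrete, here is a minimal sketch in Python with NumPy. It is not the paper's FDA module: the abstract describes an adaptive, learned separation, whereas this example uses a fixed radial low-pass mask in the Fourier domain. The function name `split_frequency_bands` and the `cutoff` parameter are hypothetical and chosen only for illustration.

```python
import numpy as np


def split_frequency_bands(feature, cutoff=0.25):
    """Split a 2-D feature map into low- and high-frequency parts.

    Assumption: a fixed radial cutoff (in cycles per sample, where the
    Nyquist limit is 0.5) stands in for the learned, adaptive separation
    described in the abstract.
    """
    h, w = feature.shape
    # Frequency coordinates of each FFT bin, in cycles per sample.
    fy = np.fft.fftfreq(h)[:, None]
    fx = np.fft.fftfreq(w)[None, :]
    radius = np.sqrt(fy ** 2 + fx ** 2)

    # Keep only bins inside the cutoff radius for the low-frequency band.
    low_mask = radius <= cutoff
    spectrum = np.fft.fft2(feature)
    low = np.fft.ifft2(spectrum * low_mask).real

    # The high band is the residual, so low + high reconstructs the input.
    high = feature - low
    return low, high


# Example: a smooth offset plus a fine checkerboard texture.
yy, xx = np.indices((8, 8))
checkerboard = ((yy + xx) % 2) * 2.0 - 1.0
low, high = split_frequency_bands(3.0 + checkerboard)
```

In this toy input the constant offset falls into the low band and the checkerboard, which oscillates at the Nyquist frequency, falls into the high band. A learned variant would replace the fixed mask with parameters predicted from the features.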

How to cite

APA:

Li, X., Qian, S., Lyu, X., Song, Y., Liu, F., Fang, Y., ... Kaup, A. (2026). Frequency–spatial decoupled co-modeling transformer for fine-grained remote sensing image segmentation. Information Sciences, 747. https://doi.org/10.1016/j.ins.2026.123489

MLA:

Li, Xin, et al. "Frequency–spatial decoupled co-modeling transformer for fine-grained remote sensing image segmentation." Information Sciences 747 (2026).
