CrackSSM: Reviving SSMs for Crack Segmentation via Dynamic Scanning
Abstract
Crack segmentation (CS) is crucial for structural inspection and maintenance in production scenarios. To achieve both high accuracy and efficiency, recent methods have adopted Mamba-based architectures built upon state space models (SSMs), which enable linear-complexity modeling of long-range dependencies. However, existing approaches typically rely on static multi-directional scanning to flatten visual features into sequences. This fixed flattening order disrupts spatial continuity and weakens the SSM’s ability to model irregular crack patterns. To address this limitation, we propose CrackSSM, a novel crack-aware segmentation framework featuring a dynamic scanning strategy that adapts the token sequence to the underlying structure of each image. Specifically, we compute directional response strength along four orientations from high-level semantic features, and use these values to reorder tokens so that crack-relevant regions remain adjacent in the sequence. This alignment improves the causal modeling ability of SSMs for irregular, fine-grained cracks while preserving their efficiency. Additionally, we design a wavelet-guided decoding mechanism to recover detailed features. It extracts high-frequency components from the input image and uses them to guide feature refinement and edge-aware fusion, further enhancing segmentation precision. Experiments on three benchmark datasets demonstrate that our method achieves superior segmentation accuracy with fewer parameters and faster inference compared to existing state-of-the-art models. Source code is available in supplementary materials.
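
To make the dynamic scanning idea concrete, the following is a minimal NumPy sketch, not the CrackSSM implementation. It assumes a single-channel feature map, uses second differences along four orientations (0°, 45°, 90°, 135°) as a stand-in for the directional response, and reorders tokens by sorting on the strongest response. The names `OFFSETS`, `directional_response`, and `dynamic_scan_order` are hypothetical, and the actual response operator and reordering rule may differ.

```python
import numpy as np

# (dy, dx) steps for four orientations: 0°, 45°, 90°, 135°.
# Assumption: the abstract only says "four orientations"; these are a common choice.
OFFSETS = [(0, 1), (-1, 1), (1, 0), (1, 1)]


def directional_response(feat):
    """Return a (4, H, W) array of second-difference magnitudes, one per orientation.

    feat is an (H, W) feature map. Borders are handled by edge replication.
    """
    h, w = feat.shape
    padded = np.pad(feat, 1, mode="edge")
    out = np.empty((len(OFFSETS), h, w), dtype=float)
    for k, (dy, dx) in enumerate(OFFSETS):
        # Neighbours one step forward and one step backward along the orientation.
        fwd = padded[1 + dy:1 + dy + h, 1 + dx:1 + dx + w]
        bwd = padded[1 - dy:1 - dy + h, 1 - dx:1 - dx + w]
        out[k] = np.abs(2.0 * feat - fwd - bwd)
    return out


def dynamic_scan_order(feat):
    """Return (order, inverse) permutations over the H*W flattened tokens.

    Tokens with the strongest directional response come first, so thin
    structures end up adjacent in the sequence. The stable sort keeps
    row-major order among tokens with equal scores.
    """
    score = directional_response(feat).max(axis=0).ravel()
    order = np.argsort(-score, kind="stable")
    inverse = np.empty_like(order)
    inverse[order] = np.arange(order.size)
    return order, inverse


# Usage: reorder tokens before a sequence model, then restore the spatial layout.
feat = np.zeros((6, 6))
feat[np.arange(6), np.arange(6)] = 1.0     # a synthetic diagonal "crack"
order, inverse = dynamic_scan_order(feat)
tokens = feat.reshape(-1, 1)               # (H*W, C) with C = 1
sequence = tokens[order]                   # input to the sequence model
restored = sequence[inverse].reshape(6, 6) # back to the original layout
```

The inverse permutation matters because the sequence model's outputs must be scattered back to their original spatial positions before decoding. Sorting costs O(N log N) in the number of tokens, so a practical design would need to keep this step cheap relative to the linear-time SSM.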