OSA: Echocardiography Video Segmentation via Orthogonalized State Update and Anatomical Prior-aware Feature Enhancement
Abstract
Accurate segmentation of cardiac chambers in echocardiography videos is essential for quantitative cardiac assessment. However, ultrasound noise, artifacts, and cardiac motion make robust spatiotemporal modeling difficult. Recent approaches based on Transformers, linear attention, and state-space models improve accuracy, but Transformers remain computationally expensive, while linear attention and state-space models typically lack geometric regularization, which leads to unstable spatiotemporal interactions under complex cardiac motion. We introduce OSA, a lightweight linear sequence architecture designed for stable and efficient cardiac video segmentation. OSA incorporates an Anatomical Prior-aware Feature Enhancement (APFE) module that decouples and fuses complementary anatomical components to strengthen boundary–region discrimination, and an Orthogonalized State Update (OSU) that enforces spectral-norm and orthogonality constraints on the recurrent transitions to preserve spatiotemporal coherence. On the CAMUS and EchoNet-Dynamic datasets, OSA consistently outperforms state-of-the-art methods in segmentation accuracy and temporal consistency while maintaining real-time inference efficiency. The framework offers a principled and efficient solution for dynamic cardiac analysis in echocardiography. The code will be released upon publication.
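To make the idea of spectral-norm and orthogonality constraints on a recurrent transition concrete, the following is a minimal NumPy sketch under stated assumptions. It is not the OSU implementation: the linear recurrence h_t = A h_{t-1} + B x_t, the SVD-based projection, and all function names (`orthogonalize`, `spectral_normalize`, `run_recurrence`) are hypothetical choices made for illustration only. It shows that an orthogonal transition preserves the state norm and that a spectrally normalized transition does not amplify it.

```python
import numpy as np


def orthogonalize(w):
    """Return the nearest orthogonal matrix to w (polar factor via SVD).

    Illustrative assumption: the abstract does not say how OSU orthogonalizes.
    """
    u, _, vt = np.linalg.svd(w)
    return u @ vt


def spectral_normalize(w, max_norm=1.0):
    """Rescale w so its largest singular value is at most max_norm."""
    sigma = np.linalg.norm(w, ord=2)  # spectral norm of a 2-D array
    return w if sigma <= max_norm else w * (max_norm / sigma)


def run_recurrence(transition, inputs, input_proj):
    """Run the linear recurrence h_t = A h_{t-1} + B x_t over a sequence.

    `inputs` has shape (steps, in_dim); the result has shape (steps, state_dim).
    """
    state = np.zeros(transition.shape[0])
    states = []
    for x in inputs:
        state = transition @ state + input_proj @ x
        states.append(state)
    return np.stack(states)


# Toy setup: random matrices stand in for learned parameters and frame features.
rng = np.random.default_rng(0)
dim, in_dim, steps = 8, 4, 50
raw = rng.normal(size=(dim, dim))
input_proj = rng.normal(size=(dim, in_dim))
inputs = rng.normal(size=(steps, in_dim))

a_orth = orthogonalize(raw)      # all singular values equal 1
a_sn = spectral_normalize(raw)   # largest singular value at most 1

states = run_recurrence(a_orth, inputs, input_proj)
probe = rng.normal(size=dim)     # arbitrary state used to check norm behavior
```

In this sketch, `a_orth @ probe` has the same Euclidean norm as `probe`, so repeated transitions neither shrink nor inflate the carried state; the spectrally normalized matrix only guarantees that the norm does not grow.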