Poster
JamMa: Ultra-lightweight Local Feature Matching with Joint Mamba
Xiaoyong Lu · Songlin Du
ExHall D Poster #409
Abstract:
Existing state-of-the-art feature matchers capture long-range dependencies with Transformers but are hindered by high spatial complexity, leading to demanding training and high-latency inference. Striking a better balance between performance and efficiency remains a critical challenge in feature matching. Inspired by the linear complexity $\mathcal{O}(N)$ of Mamba, we propose an ultra-lightweight Mamba-based matcher, named JamMa, which converges on a single GPU and achieves an impressive performance-efficiency balance in inference. To unlock the potential of Mamba for feature matching, we propose Joint Mamba with a scan-merge strategy named $\textbf{JEGO}$, which enables: (1) $\textbf{J}$oint scan of two images to achieve high-frequency mutual interaction, (2) $\textbf{E}$fficient scan with skip steps to reduce sequence length, (3) $\textbf{G}$lobal receptive field, and (4) $\textbf{O}$mnidirectional feature representation. With the above properties, the JEGO strategy significantly outperforms the scan-merge strategies proposed in VMamba and EVMamba in the feature matching task. Compared to attention-based sparse and semi-dense matchers, JamMa demonstrates a notably superior balance between performance and efficiency, delivering better performance with less than $50\%$ of the parameters and FLOPs.
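To make properties (1) and (2) concrete, here is a minimal sketch of what a joint scan with skip steps could look like. It is not the authors' implementation: the function name `joint_skip_scan`, the `step` and `offset` parameters, the row-major ordering, and the token-by-token interleaving are all assumptions chosen for illustration, and the abstract does not specify how JEGO orders or merges its scans.

```python
import numpy as np

def joint_skip_scan(feat_a, feat_b, step=2, offset=(0, 0)):
    """Hypothetical joint scan: interleave strided tokens of two feature maps.

    feat_a, feat_b: arrays of shape (H, W, C) with identical shapes (assumed).
    step:           skip step; keeps every `step`-th row and column.
    offset:         (row, col) start position of the strided scan.
    Returns a sequence of shape (2 * h * w, C) alternating A and B tokens.
    """
    oy, ox = offset
    # Skip steps: subsample each map, shrinking the sequence by about step**2.
    sub_a = feat_a[oy::step, ox::step]
    sub_b = feat_b[oy::step, ox::step]
    h, w, c = sub_a.shape

    # Flatten each map in row-major order (one assumed scan direction).
    tok_a = sub_a.reshape(h * w, c)
    tok_b = sub_b.reshape(h * w, c)

    # Joint scan: alternate tokens so a sequence model sees both images
    # at every step instead of one image after the other.
    seq = np.empty((2 * h * w, c), dtype=feat_a.dtype)
    seq[0::2] = tok_a
    seq[1::2] = tok_b
    return seq

H, W, C = 8, 8, 4
rng = np.random.default_rng(0)
feat_a = rng.standard_normal((H, W, C))
feat_b = rng.standard_normal((H, W, C))

seq = joint_skip_scan(feat_a, feat_b, step=2)
full_len = 2 * H * W  # sequence length without skipping
print(seq.shape[0], full_len)
```

With a skip step of 2 on 8×8 maps, the joint sequence holds 32 tokens rather than 128. Calling the function with different `offset` values gives scans that cover different positions, which is one plausible way several such scans could together reach every location.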