Rotation Invariant and Symmetry Aware Pixel Difference Network for Remote Sensing Object Detection
Abstract
Recent advancements in remote sensing object detection have predominantly focused on oriented bounding box design and small object feature enhancement, while often overlooking the intrinsic geometric properties of remote sensing images, such as rotation invariance and structural symmetry. Many aerial objects appear in multiple orientations and exhibit clear symmetric patterns; when these properties are not explicitly modeled, detectors can miss objects or localize them inaccurately under geometric variation or partial occlusion. To address this, we propose the Rotation Invariant and Symmetry Aware Pixel Difference Network (RIS-PiDiNet), which introduces a novel convolutional operator called Rotation Invariant and Symmetry Aware Pixel Difference Convolution (RIS-PDC). This operator replaces traditional convolution with a mathematically grounded formulation that encodes rotation group priors and symmetry constraints. RIS-PDC combines pixel differences with symmetry-guided aggregation in the polar harmonic space, enabling the network to infer partially visible structures and deduce occluded symmetric parts. Beyond improving detection accuracy, RIS-PDC enhances model interpretability by embedding geometric principles into the network design. Feature visualizations show rotation-consistent activations and symmetry-complete responses, revealing how the network captures underlying object structure even under partial visibility or orientation changes, which in turn yields geometrically interpretable detection decisions. To our knowledge, RIS-PiDiNet is the first remote sensing object detection framework that jointly incorporates rotation invariance and symmetry modeling within a unified architecture.
Extensive evaluations on standard benchmarks validate its effectiveness: RIS-PiDiNet achieves state-of-the-art performance on DOTA-v1.0 (78.53\% mAP single-scale, 81.81\% mAP multi-scale), HRSC2016 (98.60\% mAP), and DIOR-R (67.28\% mAP), all with acceptable computational overhead and no increase in parameter count.