Structure-Aware Representation Distillation for Tiny-Dense Object Segmentation
Xuesong Liu ⋅ Anke Xu ⋅ Wenbo Cao ⋅ Emmett Ientilucci
Abstract
Dense scenes containing numerous tiny objects pose a fundamental challenge for segmentation models, where small localization errors can significantly degrade downstream measurements. We present Structure-Aware Representation Distillation (SARD), a teacher-compatible framework that transfers structural knowledge from a large teacher to a compact student via feature-space alignment rather than mask imitation. SARD constructs a structure-importance map that combines boundary salience, local density, and teacher confidence, and uses it to weight a unified representation loss integrating feature consistency, distribution alignment, and structural contrast. This encourages the student to allocate capacity to geometrically informative regions while preserving global context. Experiments on Cityscapes, ADE20K, and a challenging rock fragmentation benchmark (RockFrag) show that SARD consistently improves both mIoU and boundary IoU over strong distillation baselines; on RockFrag, SARD improves a Swin-T student over CWD by +4.3 mIoU and +6.7 bIoU. A ResNet-50 student distilled from a Swin-L teacher achieves up to 7.7× parameter reduction and 9× higher throughput than the teacher, with no additional inference overhead beyond the student network, demonstrating that structure-aware representation distillation is effective and efficient for tiny-dense segmentation.
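To make the core idea concrete, the sketch below illustrates one plausible reading of the structure-importance map and the importance-weighted feature-consistency term. The specific definitions (label-change boundaries, box-filtered boundary density, max-probability confidence, and the weights `alpha`, `beta`, `gamma`) are illustrative assumptions, not the paper's exact formulation.

```python
import numpy as np

def structure_importance_map(teacher_probs, alpha=1.0, beta=1.0, gamma=1.0):
    """Hypothetical structure-importance map in the spirit of SARD.

    teacher_probs: (C, H, W) softmax probabilities from the teacher.
    Combines boundary salience, local boundary density, and teacher
    confidence; the exact terms and weights are assumptions.
    """
    hard = teacher_probs.argmax(axis=0)  # (H, W) predicted label map

    # Boundary salience: 1 where the predicted label changes between neighbours.
    boundary = np.zeros(hard.shape, dtype=float)
    boundary[:-1, :] = np.maximum(boundary[:-1, :],
                                  (hard[:-1, :] != hard[1:, :]).astype(float))
    boundary[:, :-1] = np.maximum(boundary[:, :-1],
                                  (hard[:, :-1] != hard[:, 1:]).astype(float))

    # Local density: fraction of boundary pixels in a k x k window (box filter),
    # a proxy for how crowded with tiny objects a region is.
    k, pad = 5, 2
    padded = np.pad(boundary, pad)
    density = np.zeros_like(boundary)
    for dy in range(k):
        for dx in range(k):
            density += padded[dy:dy + boundary.shape[0],
                              dx:dx + boundary.shape[1]]
    density /= k * k

    # Teacher confidence: max class probability per pixel.
    confidence = teacher_probs.max(axis=0)

    m = alpha * boundary + beta * density + gamma * confidence
    return m / (m.max() + 1e-8)  # normalise to [0, 1]

def weighted_feature_loss(f_student, f_teacher, importance):
    """Importance-weighted feature-consistency term (per-pixel MSE),
    one of the three components of the unified representation loss."""
    per_pixel = ((f_student - f_teacher) ** 2).mean(axis=0)  # (H, W)
    return (importance * per_pixel).sum() / (importance.sum() + 1e-8)
```

In practice the student's features would first be projected to the teacher's channel dimension, and the distribution-alignment and structural-contrast terms would be weighted by the same map; those pieces are omitted here for brevity.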