Poster

DVHGNN: Multi-Scale Dilated Vision HGNN for Efficient Vision Recognition

Caoshuo Li ⋅ Tanzhe Li ⋅ Xiaobin Hu ⋅ Donghao Luo ⋅ Taisong Jin

2025 Poster


Abstract

Recently, the Vision Graph Neural Network (ViG) has gained considerable attention in computer vision. Despite its groundbreaking innovation, ViG faces two key issues: the quadratic computational complexity of its K-Nearest Neighbor (KNN) graph construction, and the restriction of ordinary graphs to pairwise relations. To address these challenges, we propose a novel vision architecture, termed **D**ilated **V**ision **H**yper**G**raph **N**eural **N**etwork (DVHGNN), which is designed to leverage multi-scale hypergraphs to *efficiently* capture high-order correlations among objects. Specifically, the proposed method tailors Clustering and **D**ilated **H**yper**G**raph **C**onstruction (DHGC) to adaptively capture multi-scale dependencies among the data samples. Furthermore, a dynamic hypergraph convolution mechanism is proposed to facilitate adaptive feature exchange and fusion at the hypergraph level. Extensive qualitative and quantitative evaluations on benchmark image datasets demonstrate that the proposed DVHGNN significantly outperforms state-of-the-art vision backbones. For instance, our DVHGNN-S achieves a top-1 accuracy of **83.1%** on ImageNet-1K, surpassing ViG-S by **+1.0**$\uparrow$ and ViHGNN-S by **+0.6**$\uparrow$.
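To make the contrast between pairwise graph edges and hyperedges concrete, the sketch below shows one round of generic two-stage hypergraph message passing (vertex to hyperedge, then hyperedge to vertex) using plain mean aggregation. It is an illustrative toy under stated assumptions, not the dynamic hypergraph convolution or the DHGC procedure proposed in the paper: the function name `hypergraph_mean_pass`, the hand-written hyperedges, and the one-dimensional features are all hypothetical.

```python
# Hypothetical toy example: names and data are illustrative, not from the paper.

def hypergraph_mean_pass(features, hyperedges):
    """One round of vertex -> hyperedge -> vertex mean aggregation.

    features:   list of per-vertex feature vectors (lists of floats)
    hyperedges: list of vertex-index lists; each may join more than two vertices
    """
    dim = len(features[0])

    # Stage 1: each hyperedge averages the features of its member vertices.
    edge_feats = []
    for members in hyperedges:
        summed = [0.0] * dim
        for v in members:
            for d in range(dim):
                summed[d] += features[v][d]
        edge_feats.append([s / len(members) for s in summed])

    # Stage 2: each vertex averages the features of the hyperedges it belongs to.
    out = []
    for v in range(len(features)):
        incident = [e for e, members in enumerate(hyperedges) if v in members]
        if not incident:
            out.append(list(features[v]))  # isolated vertex keeps its feature
            continue
        summed = [0.0] * dim
        for e in incident:
            for d in range(dim):
                summed[d] += edge_feats[e][d]
        out.append([s / len(incident) for s in summed])
    return out


# Four vertices; the first hyperedge joins three of them at once,
# which a single pairwise graph edge cannot express.
features = [[1.0], [2.0], [3.0], [10.0]]
hyperedges = [[0, 1, 2], [2, 3]]
updated = hypergraph_mean_pass(features, hyperedges)
```

In this toy, vertex 2 belongs to both hyperedges, so its updated feature mixes information from both groups, while vertices 0 and 1 only see their shared three-vertex group. A learned model would typically replace the plain means with trainable, possibly input-dependent weights.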
