

Learning to Select Views for Efficient Multi-View Understanding

Yunzhong Hou · Stephen Gould · Liang Zheng

Arch 4A-E Poster #58
Fri 21 Jun 10:30 a.m. PDT — noon PDT

Abstract: Multiple camera view (multi-view) setups have proven useful in many computer vision applications. However, the high computational cost associated with multiple views poses a significant challenge for end devices with limited computational resources. In modern CPUs, pipelining breaks a longer job into steps and enables parallelism over sequential steps from multiple jobs. Inspired by this, we study selective view pipelining for efficient multi-view understanding, which breaks the computation of multiple views into steps and computes only the most helpful views/steps in parallel for the best efficiency. To this end, we use reinforcement learning to learn a lightweight view selection module that analyzes the target object or scenario from initial views and selects the next-best-view for recognition or detection for pipeline computation. Experimental results on multi-view classification and detection tasks show that our approach achieves promising performance while using only 2 or 3 out of $N$ available views, significantly reducing computational costs while maintaining parallelism on the GPU through selective view pipelining.
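The selection loop described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names (`select_views`, `aggregate`, `toy_extract`, `toy_policy`), the element-wise max fusion, and the hand-written policy are all assumptions made for illustration. In the paper the policy is learned with reinforcement learning, and selection is pipelined with feature computation on the GPU; the sketch below only shows the sequential logic of choosing a small budget of views out of $N$.

```python
from typing import Callable, List, Sequence, Tuple

Feature = List[float]


def aggregate(features: Sequence[Feature]) -> Feature:
    """Fuse per-view features by element-wise max (an assumed fusion rule)."""
    return [max(values) for values in zip(*features)]


def select_views(
    extract: Callable[[int], Feature],
    policy: Callable[[Feature, List[int]], int],
    num_views: int,
    initial_view: int,
    budget: int,
) -> Tuple[List[int], Feature]:
    """Pick up to `budget` views, one at a time, starting from `initial_view`.

    `extract` stands in for the per-view feature network and `policy` for the
    view selection module; both are hypothetical placeholders.
    """
    chosen = [initial_view]
    feats = [extract(initial_view)]
    while len(chosen) < budget:
        remaining = [v for v in range(num_views) if v not in chosen]
        if not remaining:
            break
        # The policy sees only the features fused so far and the candidates.
        next_view = policy(aggregate(feats), remaining)
        chosen.append(next_view)
        feats.append(extract(next_view))
    return chosen, aggregate(feats)


def toy_extract(view: int) -> Feature:
    """Toy feature: a one-hot vector marking which of 4 views was observed."""
    return [float(view == i) for i in range(4)]


def toy_policy(state: Feature, remaining: List[int]) -> int:
    """Toy rule: pick the least-activated candidate, breaking ties by largest index."""
    return min(remaining, key=lambda v: (state[v], -v))


# Use 3 of 4 available views, starting from view 0.
chosen, fused = select_views(toy_extract, toy_policy, num_views=4, initial_view=0, budget=3)
```

Only the views in `chosen` are ever passed to `extract`, which is where the computational saving comes from: the remaining views are never processed.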