Ego-1K – A Large-Scale Multiview Video Dataset for Egocentric Vision
Abstract
We present Ego-1K, a large-scale, time-synchronized collection of egocentric multiview videos designed to advance neural 3D video synthesis, dynamic scene understanding, and embodied perception. The dataset contains nearly 1,000 short egocentric videos captured with a custom rig of 12 synchronized cameras mounted around a VR headset worn by the user. Scene content focuses on hand motions and hand-object interactions across diverse settings. We describe the rig design, data processing pipeline, and calibration procedure. Our dataset enables new benchmarks for egocentric scene reconstruction methods, an increasingly important research area as multi-camera smart glasses become ubiquitous. Our experiments demonstrate that the dataset poses unique challenges for existing 3D and 4D novel view synthesis methods, owing to the large disparities and image motion induced by close dynamic objects and rig egomotion. Our dataset supports future research in this challenging domain, enabling 4D world creation and sharing.