google-research-datasets/Objectron

📊 Dataset · google-research-datasets

Objectron: 15K video clips with 3D bounding boxes for objects in AR.

Objectron is a large-scale dataset of short, object-centric video clips captured in mobile AR sessions and annotated in 3D. It contains about 15K videos (roughly 4M images) covering 9 everyday object categories. Each frame is annotated with a 3D bounding box that specifies the object's position, orientation, and dimensions. AR session metadata (camera poses, sparse point clouds, and planes) is also provided, which supports 3D reconstruction and scene understanding tasks. The data is split into train/validation/test sets. The dataset aims to bridge the gap between 2D and 3D object understanding, supporting research in computer vision, augmented reality, and robotics. It is released under the Computational Use of Data Agreement 1.0 (C-UDA-1.0) and includes baseline models and evaluation code in TensorFlow and PyTorch.
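To make the box annotation concrete, here is a minimal sketch of how a position, an orientation, and a set of dimensions together determine the 8 corners of an oriented 3D box. It is an illustration only: the function names `box_corners` and `yaw_rotation`, the row-major rotation-matrix convention, and the choice of y as the up axis are assumptions made for this example, not the dataset's actual annotation schema or loader API.

```python
from itertools import product
from math import cos, sin, pi


def box_corners(center, rotation, size):
    """Return the 8 corners of an oriented 3D box in world coordinates.

    center:   (x, y, z) position of the box centre (hypothetical convention).
    rotation: 3x3 row-major rotation matrix giving the box orientation.
    size:     (width, height, depth) along the box's local x, y, z axes.
    """
    corners = []
    for signs in product((-0.5, 0.5), repeat=3):
        # Corner in the box's local frame: half the extent along each axis.
        local = [s * d for s, d in zip(signs, size)]
        # Rotate into the world frame, then translate by the centre.
        world = tuple(
            sum(rotation[r][c] * local[c] for c in range(3)) + center[r]
            for r in range(3)
        )
        corners.append(world)
    return corners


def yaw_rotation(theta):
    """Rotation by theta radians about the y axis (assumed to be 'up')."""
    return [
        [cos(theta), 0.0, sin(theta)],
        [0.0, 1.0, 0.0],
        [-sin(theta), 0.0, cos(theta)],
    ]


# Example: a 0.2 x 0.3 x 0.1 box, one unit in front of the origin,
# turned a quarter of a revolution about the vertical axis.
corners = box_corners(
    center=(0.0, 0.0, -1.0),
    rotation=yaw_rotation(pi / 2),
    size=(0.2, 0.3, 0.1),
)
```

With a quarter-turn about the vertical axis, the box's width and depth swap places in world coordinates while its height is unchanged, which is a quick sanity check on the rotation convention.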

💡Highlights

├─15K videos, 4M images, 9 categories
├─3D bounding boxes + AR session metadata
└─Baseline models (TF, PyTorch) provided

🎯For

├─3D Vision Researchers
├─AR/VR Developers
└─ML Practitioners

🔗Links

└─GitHub Repository