RoVid-X Robot Video Generation Dataset
License: CC BY 4.0
RoVid-X is a robot video generation dataset released in 2026 by Peking University in collaboration with ByteDance Seed. It accompanies the research paper "Rethinking Video Generation Model for the Embodied World" and aims to address the physical challenges that video generation models face when generating robot videos.
The dataset contains approximately 4,000,000 robot video clips, totaling over 10,000 hours, and covers more than 1,300 fine-grained robot skills. The videos come with multimodal physical annotations, including RGB, depth, and optical flow, and span multiple robot types, tasks, scenarios, and motion skills.
Dataset composition:
- Approximately 4,000,000 robot video clips
- Multimodal physical annotations (RGB, depth, optical flow)
- More than 1,300 fine-grained robot skills
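
As a rough sanity check on these figures, the clip count and total duration imply an average clip length of about 9 seconds. The sketch below only does that arithmetic. The constant and function names are illustrative and not part of the dataset, and because the description gives both numbers approximately ("approximately 4,000,000", "over 10,000 hours"), the result is an estimate rather than a published statistic.

```python
# Figures quoted in the dataset description; both are approximate.
NUM_CLIPS = 4_000_000   # "approximately 4,000,000 video clips"
TOTAL_HOURS = 10_000    # "over 10,000 hours", so treat this as a lower bound


def mean_clip_seconds(total_hours: float, num_clips: int) -> float:
    """Return the average clip length in seconds implied by the totals."""
    return total_hours * 3600 / num_clips


avg_seconds = mean_clip_seconds(TOTAL_HOURS, NUM_CLIPS)
print(f"Average clip length: about {avg_seconds:.1f} s")
# prints "Average clip length: about 9.0 s"
```

Clips this short suggest that each one captures a single skill or a brief action, although the description does not state how the clips were segmented.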