VidSTG Large-Scale Video Grounding Dataset

VidSTG is a large-scale spatio-temporal video grounding dataset built on top of VidOR. VidOR is a video relation dataset containing 7,000, 835, and 2,165 videos for training, validation, and testing, respectively. The goal of the spatio-temporal video grounding task is to localize, in an untrimmed video, the spatio-temporal region that matches a given sentence describing the target object.
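As a rough illustration of what a grounding result looks like, the sketch below models the localized region as a "tube": one bounding box per frame over a contiguous range of frames. The names `Tube`, `Box`, `VIDOR_SPLITS`, and `temporal_iou`, the `(x1, y1, x2, y2)` box convention, and the choice of temporal IoU as an overlap measure are all assumptions made for this example; they are not the dataset's official annotation format or evaluation code.

```python
from dataclasses import dataclass
from typing import Dict, Tuple

# Hypothetical box convention: (x1, y1, x2, y2) in pixels.
Box = Tuple[float, float, float, float]

# VidOR split sizes as stated in the description above.
VIDOR_SPLITS = {"train": 7000, "val": 835, "test": 2165}


@dataclass
class Tube:
    """A spatio-temporal region: one box per frame index (illustrative only)."""
    boxes: Dict[int, Box]

    def span(self) -> Tuple[int, int]:
        # First and last annotated frame, inclusive.
        return min(self.boxes), max(self.boxes)


def temporal_iou(a: Tube, b: Tube) -> float:
    """Overlap of two tubes along the time axis only, counted in frames."""
    a_start, a_end = a.span()
    b_start, b_end = b.span()
    inter = max(0, min(a_end, b_end) - max(a_start, b_start) + 1)
    union = (a_end - a_start + 1) + (b_end - b_start + 1) - inter
    return inter / union


if __name__ == "__main__":
    box = (10.0, 20.0, 110.0, 220.0)
    predicted = Tube({f: box for f in range(0, 10)})      # frames 0..9
    ground_truth = Tube({f: box for f in range(5, 15)})   # frames 5..14
    print(sum(VIDOR_SPLITS.values()))
    print(temporal_iou(predicted, ground_truth))
```

This only measures how well the predicted time span matches the annotated one; a full evaluation would also compare the boxes frame by frame.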