Who's Waldo Image Captioning Dataset

Who's Waldo is a benchmark dataset for human-centric visual grounding. It contains 270k image-text pairs, each with automatically generated annotations that align the people mentioned in the text with their corresponding visual regions in the image.
The dataset is constructed from freely licensed images and descriptions from Wikimedia Commons.