Date

4 years ago

Organization

Publish URL

workshop.colips.org

Paper URL

arxiv.org

Tags

Video Captioning

AVSD (Audio Visual Scene-Aware Dialog), also known as DSTC7 Track 3, is an audio-visual dataset for dialogue understanding. It is intended for building systems that generate responses in a dialogue about an input video.
