Dense-Captioning Events in Videos: SYSU Submission to ActivityNet Challenge 2020
Teng Wang, Huicheng Zheng, Mingjing Yu

Abstract
This technical report presents a brief description of our submission to the dense video captioning task of ActivityNet Challenge 2020. Our approach follows a two-stage pipeline: first, we extract a set of temporal event proposals; then we propose a multi-event captioning model to capture the event-level temporal relationships and effectively fuse the multi-modal information. Our approach achieves a 9.28 METEOR score on the test set.
Benchmarks
| Benchmark | Methodology | Metrics |
|---|---|---|
| dense-video-captioning-on-activitynet | TSRM-CMG-HRNN+SCST | METEOR: 9.71 |