GZSL Video Classification on VGGSound-GZSL
Metrics
HM
ZSL
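In generalised zero-shot learning, HM conventionally denotes the harmonic mean of the accuracy on seen classes (S) and the accuracy on unseen classes (U), while ZSL is the accuracy on unseen classes alone. This page does not spell out the formula, so the sketch below assumes the standard definition HM = 2SU / (S + U). The function name and the example values are illustrative and are not taken from the leaderboard.

```python
def harmonic_mean(seen_acc: float, unseen_acc: float) -> float:
    """Harmonic mean of seen- and unseen-class accuracy.

    Assumes the conventional GZSL definition HM = 2*S*U / (S + U);
    both inputs are expected on the same scale (e.g. percent).
    """
    total = seen_acc + unseen_acc
    if total == 0:
        # Both accuracies are zero; define HM as zero to avoid division by zero.
        return 0.0
    return 2 * seen_acc * unseen_acc / total


# Hypothetical accuracies, chosen only to illustrate the calculation.
hm_example = harmonic_mean(10.0, 5.0)
print(f"HM = {hm_example:.2f}")
```

Because the harmonic mean is pulled towards the smaller of the two values, a model scores well on HM only if it is accurate on both seen and unseen classes.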
Results
Performance results of various models on this benchmark
| Model | HM | ZSL | Paper Title |
|---|---|---|---|
| KDA | 9.78 | 8.32 | Boosting Audio-visual Zero-shot Learning with Large Language Models |
| TCaF | 8.77 | 7.41 | Temporal and cross-modal attention for audio-visual zero-shot learning |
| Hyper-multiple | 8.67 | 7.31 | Hyperbolic Audio-visual Zero-shot Learning |
| AVCA | 8.31 | 6.91 | Audio-visual Generalised Zero-shot Learning with Cross-modal Attention and Language |