Lipreading on LRW-1000
Metrics
Top-1 Accuracy
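Top-1 accuracy is the percentage of test samples for which the model's highest-scoring class matches the ground-truth label. As a minimal sketch of how such a figure could be computed (the function name `top1_accuracy` and the toy scores are illustrative assumptions, not taken from any of the listed papers):

```python
def top1_accuracy(scores, labels):
    """Return the percentage of samples whose highest-scoring class equals the label.

    scores: list of per-class score lists, one list per sample (hypothetical input format).
    labels: list of ground-truth class indices, one per sample.
    """
    if len(scores) != len(labels):
        raise ValueError("scores and labels must have the same length")
    if not labels:
        raise ValueError("at least one sample is required")

    correct = 0
    for row, label in zip(scores, labels):
        # Index of the largest score is the top-1 prediction.
        predicted = max(range(len(row)), key=row.__getitem__)
        if predicted == label:
            correct += 1
    return 100.0 * correct / len(labels)


# Toy example with 4 samples and 3 classes (made-up numbers).
toy_scores = [
    [0.1, 0.7, 0.2],
    [0.8, 0.1, 0.1],
    [0.3, 0.3, 0.4],
    [0.2, 0.5, 0.3],
]
toy_labels = [1, 0, 1, 1]
print(top1_accuracy(toy_scores, toy_labels))
```

The values in the results table are percentages of this kind, reported by each paper under its own evaluation setup.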
Results
Performance results of various models on this benchmark
Model Name | Top-1 Accuracy (%) | Paper Title
SyncVSR (Word Boundary) | 58.2 | SyncVSR: Data-Efficient Visual Speech Recognition with End-to-End Crossmodal Audio Token Synchronization
3D-ResNet + Bi-GRU + MixUp + Label Smooth + Cosine LR (Word Boundary) | 55.7 | Learn an Effective Lip Reading Model without Pains
3D Conv + ResNet-18 + MS-TCN + Multi-Head Visual-Audio Memory | 53.8 | Distinguishing Homophenes Using Multi-Head Visual-Audio Memory for Lip Reading
3D Conv + ResNet-18 + Bi-GRU + Visual-Audio Memory | 50.82 | Multi-modality Associative Bridging through Memory: Speech Sound Recollected from Face Video
3D-ResNet + Bi-GRU + MixUp + Label Smooth + Cosine LR | 48.3 | Learn an Effective Lip Reading Model without Pains
3D Conv + ResNet-18 + Bi-GRU (Face Cutout) | 45.24 | Can We Read Speech Beyond the Lips? Rethinking RoI Selection for Deep Visual Speech Recognition
DFTN | 41.93 | Deformation Flow Based Two-Stream Network for Lip Reading
GLMIM | 38.79 | Mutual Information Maximization for Effective Lip Reading
PCPG | 38.7 | Pseudo-Convolutional Policy Gradient for Sequence-to-Sequence Lip-Reading
Lipreading On Lrw 1000 | SOTA | HyperAI