Object Detection on COCO 2017
Metrics
mAP
Results
Performance results of various models on this benchmark
| Model Name | mAP | Paper Title |
| --- | --- | --- |
| UniRepLKNet-XL++ | 56.4 | UniRepLKNet: A Universal Perception Large-Kernel ConvNet for Audio, Video, Point Cloud, Time-Series and Image Recognition |
| UniRepLKNet-L++ | 55.8 | UniRepLKNet: A Universal Perception Large-Kernel ConvNet for Audio, Video, Point Cloud, Time-Series and Image Recognition |
| UniRepLKNet-B++ | 54.8 | UniRepLKNet: A Universal Perception Large-Kernel ConvNet for Audio, Video, Point Cloud, Time-Series and Image Recognition |
| UniRepLKNet-S++ | 54.3 | UniRepLKNet: A Universal Perception Large-Kernel ConvNet for Audio, Video, Point Cloud, Time-Series and Image Recognition |
| MixMIM-L | 54.1 | MixMAE: Mixed and Masked Autoencoder for Efficient Pretraining of Hierarchical Vision Transformers |
| UniRepLKNet-S | 53 | UniRepLKNet: A Universal Perception Large-Kernel ConvNet for Audio, Video, Point Cloud, Time-Series and Image Recognition |
| MixMIM-B | 52.2 | MixMAE: Mixed and Masked Autoencoder for Efficient Pretraining of Hierarchical Vision Transformers |
| UniRepLKNet-T | 51.7 | UniRepLKNet: A Universal Perception Large-Kernel ConvNet for Audio, Video, Point Cloud, Time-Series and Image Recognition |
| BiFormer-B (IN1k pretrain, MaskRCNN 12ep) | 48.6 | BiFormer: Vision Transformer with Bi-Level Routing Attention |
| DeBiFormer-B (IN1k pretrain, MaskRCNN 12ep) | 48.5 | DeBiFormer: Vision Transformer with Deformable Agent Bi-level Routing Attention |
| BiFormer-S (IN1k pretrain, MaskRCNN 12ep) | 47.8 | BiFormer: Vision Transformer with Bi-Level Routing Attention |
| DeBiFormer-S (IN1k pretrain, MaskRCNN 12ep) | 47.5 | DeBiFormer: Vision Transformer with Deformable Agent Bi-level Routing Attention |
| DeBiFormer-B (IN1k pretrain, Retina) | 47.1 | DeBiFormer: Vision Transformer with Deformable Agent Bi-level Routing Attention |
| DeBiFormer-S (IN1k pretrain, Retina) | 45.6 | DeBiFormer: Vision Transformer with Deformable Agent Bi-level Routing Attention |
| YOLO-Drone | 35.45 | YOLO-Drone: Airborne real-time detection of dense small objects from high-altitude perspective |
| DyHead (SAP) | - | Stochastic Subsampling With Average Pooling |
| Lpixel | - | Paint Transformer: Feed Forward Neural Painting with Stroke Prediction |
| MaxViT-T | - | MaxViT: Multi-Axis Vision Transformer |
| DAT-T++ | - | DAT++: Spatially Dynamic Vision Transformer with Deformable Attention |
| MaxViT-S | - | MaxViT: Multi-Axis Vision Transformer |