Visual Instruction Following On Llava Bench
Evaluation metric
avg score
Results
Performance of each model on this benchmark
| Model name | avg score | Paper Title |
|---|---|---|
| CuMo-7B | 85.7 | CuMo: Scaling Multimodal LLM with Co-Upcycled Mixture-of-Experts |
| ShareGPT4V-13B | 79.9 | ShareGPT4V: Improving Large Multi-Modal Models with Better Captions |
| ShareGPT4V-7B | 72.6 | ShareGPT4V: Improving Large Multi-Modal Models with Better Captions |
| LLaVA-v1.5-13B | 70.7 | Improved Baselines with Visual Instruction Tuning |
| LLaVA-v1.5-7B | 63.4 | Improved Baselines with Visual Instruction Tuning |
| InstructBLIP-7B | 60.9 | InstructBLIP: Towards General-purpose Vision-Language Models with Instruction Tuning |
| InstructBLIP-13B | 58.2 | InstructBLIP: Towards General-purpose Vision-Language Models with Instruction Tuning |
| BLIP-2 | 38.1 | BLIP-2: Bootstrapping Language-Image Pre-training with Frozen Image Encoders and Large Language Models |