Question Answering on BoolQ

Metric: Accuracy

Results: performance of various models on this benchmark.
| Model Name | Accuracy (%) | Paper Title |
| --- | --- | --- |
| Mistral-Nemo 12B (HPT) | 99.87 | Hierarchical Prompting Taxonomy: A Universal Evaluation Framework for Large Language Models |
| Gemma-7B | 99.419 | Hierarchical Prompting Taxonomy: A Universal Evaluation Framework for Large Language Models |
| ST-MoE-32B 269B (fine-tuned) | 92.4 | ST-MoE: Designing Stable and Transferable Sparse Expert Models |
| PaLM 540B (fine-tuned) | 92.2 | PaLM: Scaling Language Modeling with Pathways |
| Turing NLR v5 XXL 5.4B (fine-tuned) | 92 | Toward Efficient Language Model Pretraining and Downstream Adaptation via Self-Evolution: A Case Study on SuperGLUE |
| T5-XXL 11B (fine-tuned) | 91.2 | Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer |
| PaLM 2-L (1-shot) | 90.9 | PaLM 2 Technical Report |
| UL2 20B (fine-tuned) | 90.8 | UL2: Unifying Language Learning Paradigms |
| Vega v2 6B (fine-tuned) | 90.5 | Toward Efficient Language Model Pretraining and Downstream Adaptation via Self-Evolution: A Case Study on SuperGLUE |
| DeBERTa-1.5B | 90.4 | DeBERTa: Decoding-enhanced BERT with Disentangled Attention |
| ST-MoE-L 4.1B (fine-tuned) | 88.6 | ST-MoE: Designing Stable and Transferable Sparse Expert Models |
| PaLM 2-M (1-shot) | 88.6 | PaLM 2 Technical Report |
| PaLM 2-S (1-shot) | 88.1 | PaLM 2 Technical Report |
| MUPPET RoBERTa Large | 87.5 | Muppet: Massive Multi-task Representations with Pre-Finetuning |
| FLAN 137B (prompt-tuned) | 86.3 | Finetuned Language Models Are Zero-Shot Learners |
| RoBERTa-large 355M + Entailment as Few-shot Learner | 86.0 | Entailment as Few-Shot Learner |
| T5-Large 770M (fine-tuned) | 85.4 | Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer |
| LLaMA 65B (0-shot) | 85.3 | LLaMA: Open and Efficient Foundation Language Models |
| LLaMA 2 70B (0-shot) | 85 | Llama 2: Open Foundation and Fine-Tuned Chat Models |
| FLAN 137B (4-shot) | 84.6 | Finetuned Language Models Are Zero-Shot Learners |
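For reference, the Accuracy column is simply the percentage of BoolQ yes/no questions a system answers correctly. The sketch below shows one way such a score could be computed; it assumes the dataset is published as `boolq` on the Hugging Face Hub with `question`, `passage`, and boolean `answer` fields (an assumption, not something stated on this page), and it substitutes a trivial always-True baseline for a real model. The leaderboard numbers above come from the cited papers' own evaluation setups.

```python
# Minimal sketch of the BoolQ Accuracy metric: the share of yes/no questions
# for which a model's boolean prediction matches the gold answer.
# Assumes the "boolq" dataset on the Hugging Face Hub with "question",
# "passage", and boolean "answer" fields (hypothetical for this page).
from datasets import load_dataset


def accuracy(predictions, references):
    """Fraction of examples where the predicted boolean equals the gold label."""
    correct = sum(p == r for p, r in zip(predictions, references))
    return correct / len(references)


if __name__ == "__main__":
    validation = load_dataset("boolq", split="validation")
    gold = [bool(example["answer"]) for example in validation]

    # Placeholder "model": always answer True. A real system would condition
    # on example["question"] and example["passage"].
    predictions = [True for _ in validation]

    print(f"Always-True baseline accuracy: {accuracy(predictions, gold) * 100:.1f}%")
```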