Anthropic

Abstract
This addendum to our Claude 3 Model Card describes Claude 3.5 Sonnet, a new model which outperforms our previous most capable model, Claude 3 Opus, while operating faster and at a lower cost. Claude 3.5 Sonnet offers improved capabilities, including better coding and visual processing. Since it is an evolution of the Claude 3 model family, we are providing an addendum rather than a new model card. We provide updated key evaluations and results from our safety testing.
Benchmarks
| Benchmark | Model (setting) | Metrics |
|---|---|---|
| code-generation-on-humaneval | GPT-4o (0-shot) | Pass@1: 90.2 |
| mmr-total-on-mrr-benchmark | Claude 3.5 Sonnet | Total Column Score: 463 |
| multi-task-language-understanding-on-mmlu | Claude 3.5 Sonnet (5-shot) | Average (%): 88.7 |
| question-answering-on-newsqa | Anthropic/claude-3-5-sonnet | EM: 74.23 F1: 82.3 |
| visual-question-answering-on-mm-vet | Claude 3.5 Sonnet (claude-3-5-sonnet-20240620) | GPT-4 score: 74.2±0.2 |
| visual-question-answering-on-mm-vet-v2 | Claude 3.5 Sonnet (claude-3-5-sonnet-20240620) | GPT-4 score: 71.8±0.2 |