4 months ago

Claude 3.5 Sonnet Model Card Addendum

Anthropic

Abstract

This addendum to our Claude 3 Model Card describes Claude 3.5 Sonnet, a new model which outperforms our previous most capable model, Claude 3 Opus, while operating faster and at a lower cost. Claude 3.5 Sonnet offers improved capabilities, including better coding and visual processing. Since it is an evolution of the Claude 3 model family, we are providing an addendum rather than a new model card. We provide updated key evaluations and results from our safety testing.

Benchmarks

Benchmark	Model (setting)	Metrics
code-generation-on-humaneval	GPT-4o (0-shot)	Pass@1: 90.2
mmr-total-on-mrr-benchmark	Claude 3.5 Sonnet	Total Column Score: 463
multi-task-language-understanding-on-mmlu	Claude 3.5 Sonnet (5-shot)	Average (%): 88.7
question-answering-on-newsqa	Anthropic/claude-3-5-sonnet	EM: 74.23 F1: 82.3
visual-question-answering-on-mm-vet	Claude 3.5 Sonnet (claude-3-5-sonnet-20240620)	GPT-4 score: 74.2±0.2
visual-question-answering-on-mm-vet-v2	Claude 3.5 Sonnet (claude-3-5-sonnet-20240620)	GPT-4 score: 71.8±0.2

