Claude 3.7 Sonnet · Models · The Open-Source AI Stack

Cost

$3.00 / Mtok input

$15.00 / Mtok output

Anthropic API · as of 2026-05-21

source ↗

Speed

0 tok/sec output

0 ms TTFT

Anthropic API · as of 2026-05-21

via Artificial Analysis ↗

Why people cared

Claude 3.7 Sonnet was the first "hybrid reasoning" frontier model: standard and extended-thinking modes selectable per request, billed at the same input/output rates but with extended thinking consuming additional output tokens for the visible reasoning trace. The release in February 2025 paired the model with Claude Code, Anthropic's official agent harness, which made 3.7 Sonnet the default backend for serious coding-agent work through 2025. The SWE-Bench Verified score (70.3% at release) was the headline number: a closed reasoning model that could be asked to fix a real bug in a real repository and succeed at a rate that materially exceeded prior frontier models. The agentic story matters because it established "the model that can act in your codebase" as a distinct product category from "the model that can answer questions about your codebase", and Anthropic was the first frontier lab to commercialize that distinction. Open-weights catch-up arrived with Kimi K2 and DeepSeek's later releases, but Claude 3.7 Sonnet held the agentic-coding leadership position long enough to define what the category looked like.

Architecture

Schema-generated from data/models.yaml. Every label is auditable against the model's sources.

Specs

Architecture: unknown
Total params: not disclosed
Active params: not disclosed
Context window: not verified
Attention: unknown
Position encoding: unknown
Post-training: rlhf, constitutional
OSI-approved: no
Data released: no
Training code: not released

Benchmarks

Each score carries the date it was published; we never infer or interpolate missing scores.

General reasoning

MMLU-Pro

80.3

as of 2026-05-21

source ↗

Code

SWE-Bench Verified	70.3	as of 2025-02-24	source ↗
LiveCodeBench	39.4	as of 2026-05-21	source ↗

Math

MATH	85.0	as of 2026-05-21	source ↗
AIME 2024	22.3	as of 2026-05-21	source ↗
AIME 2025	21.0	as of 2026-05-21	source ↗

Recommended use cases

code agent backend
extended-thinking math/code
SWE-Bench-style tasks
tool use

Available quantizations

None. The weights are not distributed, so there are no public quantizations.

Notable innovations

· Hybrid extended-thinking mode
· Claude Code agent harness

Known limitations

· Extended-thinking mode bills the reasoning trace as output tokens; long-thinking requests can be 5-10x more expensive than standard mode. source ↗

Lineage

Derived from

Claude 3.5 Sonnet 2024-06-20

Sources

Claude 3.7 Sonnet announcement (Anthropic, Feb 24 2025) ↗