The Open-Source AI Stack


Mistral Large 2 vs Claude 3.5 Sonnet

Rows highlighted in warm gray are where the models differ. Numbers carry their as-of date and primary source.

Specs

Field | A: Mistral Large 2 | B: Claude 3.5 Sonnet
Released | 2024-07-24 | 2024-06-20
Developer | Mistral AI | Anthropic
Openness | Source-available | Proprietary
License | Mistral Research License | Proprietary
OSI-approved | no | no
Data released | no | no
Training code | no | no
Architecture | dense | unknown
Total params 123B
Active params
Experts
Context window | 131K | 200K
Attention | GQA | unknown
Position enc. | RoPE | unknown
Pretraining tokens
Post-training | SFT, DPO | RLHF, constitutional
Training hardware
$/M input $2.00
$/M output $6.00
Output tok/sec 31.7
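
The price and throughput rows translate into per-request estimates with simple arithmetic. The sketch below is illustrative only: the function names are hypothetical, the three constants are copied from the rows above, and it assumes all three figures describe the same model and that output streams at a constant rate (no queueing, prompt processing, or time to first token).

```python
# Hypothetical helpers; the constants are the listed figures from the Specs rows above.
INPUT_USD_PER_M = 2.00      # $/M input
OUTPUT_USD_PER_M = 6.00     # $/M output
OUTPUT_TOK_PER_SEC = 31.7   # Output tok/sec


def request_cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Cost of one request at the listed per-million-token prices."""
    return (input_tokens * INPUT_USD_PER_M
            + output_tokens * OUTPUT_USD_PER_M) / 1_000_000


def generation_seconds(output_tokens: int) -> float:
    """Time to stream the output, assuming a constant listed rate."""
    return output_tokens / OUTPUT_TOK_PER_SEC


# Example: a 10,000-token prompt that produces a 1,000-token answer.
cost = request_cost_usd(10_000, 1_000)
secs = generation_seconds(1_000)
print(f"cost: ${cost:.4f}")   # cost: $0.0260
print(f"time: {secs:.1f} s")  # time: 31.5 s
```

Because output is priced at three times the input rate, long answers cost more than long prompts of the same token count.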

Benchmarks

Missing scores render as not reported; never inferred. Bold highlights the leader per benchmark.

General reasoning

MMLU 84.0 2024-07-24
MMLU-Pro 69.7 2026-05-21
GPQA-Diamond 48.6 2026-05-21

Code

HumanEval 92.0 2024-07-24
LiveCodeBench 29.3 2026-05-21

Math

MATH 71.5 2024-07-24
AIME 2024 11.0 2026-05-21
AIME 2025 14.0 2026-05-21

Context · A

Mistral's frontier dense model from July 2024: 123B parameters, sized for single-node inference, with a 128K-token context window (131,072 tokens, listed as 131K in the specs above). Weights are downloadable under the Mistral Research License for non-commercial use; production deployment requires a separate, paid Mistral Commercial License. It was trained with an explicit emphasis on reducing hallucinations and on supporting parallel and sequential function calling, and it covers dozens of natural and programming languages.

Context · B

The first Claude release to beat its own larger sibling, Claude 3 Opus, on most benchmarks. It introduced Artifacts, setting off a wave of code-and-canvas imitations in other products.
