Qwen 2.5 72B Instruct vs Claude 3.5 Haiku
Numbers carry their as-of date and primary source.
Specs
| Field | A: Qwen 2.5 72B Instruct | B: Claude 3.5 Haiku |
|---|---|---|
| Released | 2024-09-19 | 2024-11-04 |
| Developer | Alibaba | Anthropic |
| Openness | Open | Proprietary |
| License | Qwen License (72B variant requires agreement; most sibling sizes are Apache-2.0) | Proprietary |
| OSI-approved | no | no |
| Data released | no | no |
| Training code | no | no |
| Architecture | dense | unknown |
| Total params | 72.7B | — |
| Active params | — | — |
| Experts | — | — |
| Context window | 131K | 200K |
| Attention | GQA | unknown |
| Position enc. | RoPE | unknown |
| Pretraining tokens | 18.0T | — |
| Post-training | SFT, DPO | RLHF, Constitutional AI |
| Training hardware | — | — |
| $/M input | $0.36 | $0.80 |
| $/M output | $0.40 | $4.00 |
| Output tok/sec | 54.1 | — |
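To make the per-million-token prices concrete, here is a minimal sketch of the cost arithmetic. It assumes the listed $/M rates apply linearly to every token, with no caching or batch discounts. The `request_cost` helper and the 10,000-input / 1,000-output workload are hypothetical, chosen only for illustration, and real prices vary by provider and date.

```python
# USD per million tokens, copied from the Specs table.
PRICES = {
    "Qwen 2.5 72B Instruct": {"input": 0.36, "output": 0.40},
    "Claude 3.5 Haiku": {"input": 0.80, "output": 4.00},
}


def request_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Return the USD cost of one request at the listed per-million-token rates.

    Hypothetical helper: assumes simple linear pricing with no discounts.
    """
    rates = PRICES[model]
    total = input_tokens * rates["input"] + output_tokens * rates["output"]
    return total / 1_000_000


if __name__ == "__main__":
    # Hypothetical workload: 10,000 input tokens and 1,000 output tokens.
    for name in PRICES:
        print(f"{name}: ${request_cost(name, 10_000, 1_000):.4f}")
```

For this workload the listed rates come to roughly $0.004 for A and $0.012 for B. The gap widens as the output share grows, because the output price differs by 10x while the input price differs by a little over 2x.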
Benchmarks
Missing scores are shown as — (not reported) and are never inferred. Bold highlights the leader per benchmark.
General reasoning
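| Benchmark | A: Qwen 2.5 72B Instruct (as of) | B: Claude 3.5 Haiku (as of) |
|---|---|---|
| MMLU-Pro | 72.0 (2026-05-21) | 63.4 (2026-05-21) |
| GPQA-Diamond | 49.0 (2024-09-19) | 40.8 (2026-05-21) |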
Code
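| Benchmark | A: Qwen 2.5 72B Instruct (as of) | B: Claude 3.5 Haiku (as of) |
|---|---|---|
| HumanEval | 86.6 (2024-09-19) | — |
| SWE-Bench Verified | — | 40.6 (2024-10-22) |
| LiveCodeBench | 27.6 (2026-05-21) | 31.4 (2026-05-21) |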
Math
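| Benchmark | A: Qwen 2.5 72B Instruct (as of) | B: Claude 3.5 Haiku (as of) |
|---|---|---|
| MATH | 85.8 (2026-05-21) | 72.1 (2026-05-21) |
| AIME 2024 | 16.0 (2026-05-21) | 3.3 (2026-05-21) |
| AIME 2025 | 14.0 (2026-05-21) | — |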
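The leader rule described for these tables (the higher score leads, and missing scores are never inferred) can be sketched as follows. This is an illustrative reading, not the page's actual logic. The `leader` function is hypothetical, it assumes higher is better on every benchmark listed, and it treats a row with only one reported score as having no leader.

```python
from typing import Optional

# (score_a, score_b) pairs copied from the Benchmarks tables.
# None stands for a score that is not reported.
SCORES = {
    "MMLU-Pro": (72.0, 63.4),
    "GPQA-Diamond": (49.0, 40.8),
    "HumanEval": (86.6, None),
    "SWE-Bench Verified": (None, 40.6),
    "LiveCodeBench": (27.6, 31.4),
    "MATH": (85.8, 72.1),
    "AIME 2024": (16.0, 3.3),
    "AIME 2025": (14.0, None),
}


def leader(score_a: Optional[float], score_b: Optional[float]) -> Optional[str]:
    """Return "A", "B", or "tie"; return None when either score is missing.

    Assumption: a missing score is never filled in, so no leader is declared.
    """
    if score_a is None or score_b is None:
        return None
    if score_a == score_b:
        return "tie"
    return "A" if score_a > score_b else "B"


if __name__ == "__main__":
    for name, (a, b) in SCORES.items():
        print(f"{name}: {leader(a, b) or 'no comparison'}")
```

Under these assumptions, A leads on four of the five benchmarks where both models have a score, and B leads on LiveCodeBench. The GPQA-Diamond scores carry different as-of dates (2024-09-19 and 2026-05-21), so that comparison deserves extra care.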
Context · A
At release, the 72B model led open-weights models on most reasoning benchmarks, following pretraining on 18T tokens. It shipped the same day as the Coder and Math variants and the 0.5B, 1.5B, 3B, 7B, 14B, and 32B sibling sizes.
Context · B
Announced on October 22, 2024, alongside the upgraded Claude 3.5 Sonnet and the computer-use beta; it became generally available in early November. Anthropic positioned it as matching Claude 3 Opus on many intelligence benchmarks at Haiku-tier speeds. Its SWE-Bench Verified score of 40.6% briefly made it the best small-tier coding model from a major lab.