The Open-Source AI Stack

Qwen 3 32B Instruct vs Mistral Medium 3

Rows highlighted in warm gray are where the models differ. Numbers carry their as-of date and primary source.

Specs

| Field | A: Qwen 3 32B Instruct | B: Mistral Medium 3 |
| --- | --- | --- |
| Released | 2025-04-28 | 2025-05-07 |
| Developer | Alibaba | Mistral AI |
| Openness | Open | Proprietary |
| License | Apache-2.0 | Proprietary |
| OSI-approved | yes | no |
| Data released | no | no |
| Training code | no | no |
| Architecture | dense | unknown |
| Total params | 32.8B | |
| Active params | | |
| Experts | | |
| Context window | 131K | 128K |
| Attention | GQA | unknown |
| Position enc. | RoPE | unknown |
| Pretraining tokens | 36.0T | |
| Post-training | SFT, GRPO, RLHF | SFT, RLHF |
| Training hardware | | |
| $/M input | $0.15 | $0.40 |
| $/M output | $0.59 | $2.00 |
| Output tok/sec | 98.7 | 29 |
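
To make the per-token prices in the Specs table concrete, here is a minimal sketch that estimates the bill for a single workload under both price lists. The `Pricing` class and the workload size (2M input tokens, 0.5M output tokens) are hypothetical and chosen only for illustration; the prices are the ones listed above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Pricing:
    """Hypothetical helper: list prices in USD per million tokens."""
    input_per_m: float
    output_per_m: float

    def cost(self, input_tokens: int, output_tokens: int) -> float:
        # Price is quoted per million tokens, so scale the token counts down.
        total = input_tokens * self.input_per_m + output_tokens * self.output_per_m
        return total / 1_000_000


# Prices copied from the Specs table.
QWEN3_32B = Pricing(input_per_m=0.15, output_per_m=0.59)
MISTRAL_MEDIUM_3 = Pricing(input_per_m=0.40, output_per_m=2.00)

# Assumed workload, for illustration only.
INPUT_TOKENS = 2_000_000
OUTPUT_TOKENS = 500_000

cost_a = QWEN3_32B.cost(INPUT_TOKENS, OUTPUT_TOKENS)
cost_b = MISTRAL_MEDIUM_3.cost(INPUT_TOKENS, OUTPUT_TOKENS)

print(f"A: ${cost_a:.3f}  B: ${cost_b:.3f}  B/A: {cost_b / cost_a:.2f}x")
```

The ratio depends on the input/output mix: the output price gap (about 3.4x) is larger than the input price gap (about 2.7x), so output-heavy workloads widen the difference.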

Benchmarks

Missing scores render as not reported; never inferred. Bold highlights the leader per benchmark.

General reasoning

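| Benchmark | A: Qwen 3 32B Instruct | As of | B: Mistral Medium 3 | As of |
| --- | --- | --- | --- | --- |
| MMLU-Pro | 72.7 | 2026-05-21 | 76.0 | 2026-05-21 |
| GPQA-Diamond | 53.5 | 2026-05-21 | 57.8 | 2026-05-21 |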

Code

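| Benchmark | A: Qwen 3 32B Instruct | As of | B: Mistral Medium 3 | As of |
| --- | --- | --- | --- | --- |
| LiveCodeBench | 28.8 | 2026-05-21 | 40.0 | 2026-05-21 |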

Math

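| Benchmark | A: Qwen 3 32B Instruct | As of | B: Mistral Medium 3 | As of |
| --- | --- | --- | --- | --- |
| MATH | 86.9 | 2026-05-21 | 90.7 | 2026-05-21 |
| AIME 2024 | 30.3 | 2026-05-21 | 44.0 | 2026-05-21 |
| AIME 2025 | 19.7 | 2026-05-21 | 30.3 | 2026-05-21 |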
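
The leader rule stated at the top of this section (highlight the higher score, never infer a missing one) can be sketched as follows. This is an illustrative reconstruction, not the site's actual code: the `leader` function, the `SCORES` mapping, and the handling of ties are assumptions; the scores are the ones listed above.

```python
from typing import Optional

# Scores copied from the benchmark rows above as (A, B) pairs.
# None would stand for "not reported".
SCORES: dict[str, tuple[Optional[float], Optional[float]]] = {
    "MMLU-Pro": (72.7, 76.0),
    "GPQA-Diamond": (53.5, 57.8),
    "LiveCodeBench": (28.8, 40.0),
    "MATH": (86.9, 90.7),
    "AIME 2024": (30.3, 44.0),
    "AIME 2025": (19.7, 30.3),
}


def leader(a: Optional[float], b: Optional[float]) -> Optional[str]:
    """Return "A", "B", or "tie"; return None if either score is missing."""
    if a is None or b is None:
        # A missing score is never inferred, so no leader is declared.
        return None
    if a == b:
        # Assumption: equal scores are reported as a tie rather than a win.
        return "tie"
    return "A" if a > b else "B"


leaders = {name: leader(a, b) for name, (a, b) in SCORES.items()}

for name, who in leaders.items():
    print(f"{name}: {who if who is not None else 'not reported'}")
```

With the scores listed here, model B leads on every benchmark.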

Context · A

Dense 32B variant of the Qwen 3 release, shipped alongside the MoE 235B A22B flagship and a full size ladder from 0.6B to 32B dense. All sizes ship under Apache 2.0, with the same hybrid thinking-vs-fast inference toggle as the MoE.

Context · B

Mid-tier flagship released May 7, 2025 at $0.40 input / $2.00 output per million tokens, with a 128K context window. Mistral positioned it as delivering roughly 90 percent of Claude Sonnet 3.7's performance at a fraction of the cost, with self-hosted deployment supported on setups starting at four GPUs.
