Models · Compare

OpenAI o3-mini vs DeepSeek-R1

Rows highlighted in warm gray are where the models differ. Numbers carry their as-of date and primary source.

Specs

Field	A: OpenAI o3-mini	B: DeepSeek-R1
Released	2025-01-31	—
Developer	OpenAI	DeepSeek
Openness	Proprietary	Open
License	Proprietary	MIT
OSI-approved	no	yes
Data released	no	no
Training code	no	no
Architecture	unknown	moe
Total params	—	671B
Active params	—	37B
Experts	—	—
Context window	200K	128K
Attention	unknown	mla
Position enc.	unknown	rope-yarn
Pretraining tokens	—	—
Post-training	rlhf	sft, grpo, rejection-sampling
Training hardware	—	H800
$/M input	$1.10	$1.35
$/M output	$4.40	$4.20
Output tok/sec	145.3	0

Benchmarks

Missing scores render as not reported; never inferred. Bold highlights the leader per benchmark.

General reasoning

MMLU	—	90.8 2025-01-22
MMLU-Pro	79.1 2026-05-21	84.9 2026-05-21
GPQA-Diamond	79.7 2025-01-31	71.5 2025-01-22

Code

SWE-Bench Verified	49.3 2025-01-31	—
LiveCodeBench	71.7 2026-05-21	77.0 2026-05-21

Math

MATH	97.3 2026-05-21	97.3 2025-01-22
AIME 2024	87.3 2025-01-31	79.8 2025-01-22
AIME 2025	—	76.0 2026-05-21

Context · A

Released January 31, 2025 as a smaller, faster o-series reasoning model with three selectable effort levels (low, medium, high). Positioned by OpenAI as a specialized alternative to o1 for technical domains requiring precision and speed at a fraction of o1's cost.

Context · B

The first openly-released reasoning model competitive with OpenAI o1. The R1-Zero variant demonstrated that pure-RL post- training without SFT could elicit chain-of-thought reasoning. MIT-licensed weights, distillations into 1.5B-70B sizes.

OpenAI o3-mini detail → · DeepSeek-R1 detail →