The Open-Source AI Stack


OpenAI o3

OpenAI · 2025-04-16 · Proprietary

Released on April 16, 2025, as the full-size successor to o1 in the o-series of reasoning models. It accepts multimodal (text and image) input and has a 200K-token context window. OpenAI reported large gains over o1 on SWE-bench Verified and Codeforces, and roughly three times o1's accuracy on ARC-AGI.

Cost

$2.00 / Mtok input

$8.00 / Mtok output

OpenAI API · as of 2026-05-21
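As a rough illustration of how the per-token prices above translate into a per-request cost, here is a minimal sketch. The function name and the example token counts are hypothetical. It assumes the listed prices apply uniformly (no caching or batch discounts) and that any hidden reasoning tokens are counted as output tokens.

```python
# Prices listed on this page (USD per million tokens, as of 2026-05-21).
INPUT_USD_PER_MTOK = 2.00
OUTPUT_USD_PER_MTOK = 8.00


def estimate_cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Estimate the cost of one request from its token counts.

    Assumption: output_tokens includes any reasoning tokens the
    model generates, since those are typically billed as output.
    """
    total = input_tokens * INPUT_USD_PER_MTOK + output_tokens * OUTPUT_USD_PER_MTOK
    return total / 1_000_000


# Hypothetical request: 10,000 input tokens and 2,000 output tokens.
print(f"${estimate_cost_usd(10_000, 2_000):.3f}")  # prints $0.036
```

Because output is priced at four times input, long reasoning traces can dominate the bill even when prompts are large.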

Speed

88 tok/sec output

10746 ms TTFT

As of 2026-05-21
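The two speed figures above can be combined into a back-of-the-envelope estimate of end-to-end response time. The sketch below assumes a simple linear model: a fixed time to first token followed by a constant output rate. Real latency varies with load, prompt length, and reasoning effort, and the function name is hypothetical.

```python
# Speed figures listed on this page (as of 2026-05-21).
TTFT_MS = 10746           # time to first token, in milliseconds
OUTPUT_TOK_PER_SEC = 88   # output throughput once streaming starts


def estimate_latency_sec(output_tokens: int) -> float:
    """Estimate the total response time for a given number of output tokens.

    Assumption: latency = TTFT + output_tokens / throughput, with both
    figures treated as constants.
    """
    return TTFT_MS / 1000 + output_tokens / OUTPUT_TOK_PER_SEC


# Hypothetical response of 880 visible output tokens.
print(f"{estimate_latency_sec(880):.1f} s")  # prints 20.7 s
```

For short answers the time to first token dominates: under this model, even a one-token reply takes nearly 11 seconds.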

Architecture

Schema-generated from data/models.yaml. Every label is auditable against the model's sources.

Specs

Architecture: unknown
Total params: not disclosed
Active params: not disclosed
Context window: 200K tokens
Attention: unknown
Position encoding: unknown
Post-training: RLHF
OSI-approved: no
Data released: no
Training code: not released

Benchmarks

Each score carries the date it was published; we never infer or interpolate missing scores.

General reasoning

MMLU-Pro	85.3	as of 2026-05-21	source ↗
GPQA-Diamond	87.7	as of 2025-04-16	source ↗

Code

SWE-Bench Verified	71.7	as of 2025-04-16	source ↗
LiveCodeBench	80.8	as of 2026-05-21	source ↗

Math

MATH	99.2	as of 2026-05-21	source ↗
AIME 2024	90.3	as of 2026-05-21	source ↗
AIME 2025	88.3	as of 2026-05-21	source ↗

Available quantizations

None. The weights are not distributed, so there are no public quantizations.

Notable innovations

· Multimodal reasoning over text and images
· Step-change in SWE-bench Verified over o1

Sources