OpenAI o1 · Models · The Open-Source AI Stack

Cost

$15.00 / Mtok input

$60.00 / Mtok output

OpenAI API · as of 2026-05-21

via Artificial Analysis ↗
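
To make the listed prices concrete, here is a minimal sketch of per-request cost arithmetic. The two rates come from the figures above. The function name and the example token counts are hypothetical, chosen only for illustration. It assumes hidden reasoning tokens are billed at the output rate, as described elsewhere on this page.

```python
# Prices from the Cost section (USD per million tokens).
INPUT_USD_PER_MTOK = 15.00
OUTPUT_USD_PER_MTOK = 60.00


def request_cost_usd(input_tokens: int, reasoning_tokens: int, visible_tokens: int) -> float:
    """Estimate the cost of one request in USD.

    Hidden reasoning tokens are billed at the output rate, so they are
    added to the visible answer tokens before pricing.
    """
    output_tokens = reasoning_tokens + visible_tokens
    input_cost = input_tokens * INPUT_USD_PER_MTOK / 1_000_000
    output_cost = output_tokens * OUTPUT_USD_PER_MTOK / 1_000_000
    return input_cost + output_cost


# Hypothetical request: 2,000 prompt tokens, 20,000 reasoning tokens,
# 1,000 visible answer tokens.
example_cost = request_cost_usd(2_000, 20_000, 1_000)
print(f"${example_cost:.2f}")  # prints $1.29
```

In this illustrative case the reasoning trace accounts for $1.20 of the $1.29 total, which is why the output rate dominates the bill.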

Speed

75.8 tok/sec output

25278 ms TTFT

OpenAI API · as of 2026-05-21

via Artificial Analysis ↗
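
The two speed figures can be combined into a rough end-to-end latency estimate. This is a sketch under a stated assumption: that the measured TTFT already covers the hidden reasoning phase, and that visible tokens then stream at the listed output rate. The measurement methodology is not given here, so treat the result as a ballpark. The function name and example token count are hypothetical.

```python
# Figures from the Speed section.
TTFT_MS = 25_278            # time to first token, in milliseconds
OUTPUT_TOK_PER_SEC = 75.8   # output throughput once tokens start streaming


def estimated_latency_sec(visible_tokens: int) -> float:
    """Rough wall-clock time for a response: TTFT plus streaming time.

    Assumption: TTFT includes the hidden reasoning phase, so only the
    visible tokens are counted against the streaming rate.
    """
    return TTFT_MS / 1000 + visible_tokens / OUTPUT_TOK_PER_SEC


# Hypothetical answer of 758 visible tokens (about 10 s of streaming).
latency = estimated_latency_sec(758)
print(f"{latency:.1f} s")  # prints 35.3 s
```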

Why people cared

OpenAI o1 was the first publicly available frontier reasoning model and the existence proof for spending extra inference-time compute on a private chain of thought before answering. The September 2024 preview and the December 2024 GA release established the template: the model produces a hidden reasoning trace (billed at output rates) before the user-visible answer, and its scores on GPQA-Diamond and AIME materially exceeded those of GPT-4o, OpenAI's non-reasoning flagship at the time.

The pricing structure was new to the market: at $60 per million output tokens, with reasoning traces consuming most of the output budget, a single hard problem could cost dollars rather than fractions of a cent.

That created two follow-on stories. First, the open community responded with DeepSeek R1 about four months after the preview, under an MIT license, demonstrating that the reasoning recipe was within reach of organizations not at OpenAI's scale. Second, the reasoning-vs-cost framing made "thinking budget" a first-class deployment knob: later OpenAI reasoning models (o3, o3-mini) and competitor responses (Claude 3.7 extended thinking, Gemini 2.5 thinking) let the developer dial how much inference compute to spend per request.

Architecture

Schema-generated from data/models.yaml. Every label is auditable against the model's sources.

Specs

Architecture: unknown
Total params: not disclosed
Active params: not disclosed
Context window: not verified
Attention: unknown
Position encoding: unknown
Post-training: rlhf
OSI-approved: no
Data released: no
Training code: not released

Benchmarks

Each score carries the date it was published; we never infer or interpolate missing scores.

General reasoning

MMLU-Pro	84.1	as of 2026-05-21	source ↗
GPQA-Diamond	77.3	as of 2024-12-05	source ↗

Code

LiveCodeBench	67.9	as of 2026-05-21	source ↗

Math

MATH	97.0	as of 2026-05-21	source ↗
AIME 2024	83.3	as of 2024-12-05	source ↗

Recommended use cases

· math reasoning
· code reasoning
· complex multi-step problems

Available quantizations

None. The weights are not distributed, so there are no public quantizations.

Notable innovations

· Inference-time reasoning compute
· Chain-of-thought as a training target

Known limitations

· Reasoning traces are billed as output tokens but not visible to the user; cost-per-problem can be hard to predict. source ↗
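
One way to bound the unpredictability is to work backwards from a spending limit to an output-token cap. The sketch below uses the rates listed on this page. The function name and the budget figures are hypothetical. It assumes the cap is enforced by something like the OpenAI API's `max_completion_tokens` parameter, which to our understanding counts reasoning and visible tokens together. Note that a tight cap can cut the model off before it produces any visible answer.

```python
import math

# Prices from the Cost section (USD per million tokens).
INPUT_USD_PER_MTOK = 15.00
OUTPUT_USD_PER_MTOK = 60.00


def max_output_tokens_for_budget(budget_usd: float, input_tokens: int) -> int:
    """Largest output-token cap (reasoning + visible) that fits the budget.

    Returns 0 when the prompt alone already uses up the budget.
    """
    remaining_usd = budget_usd - input_tokens * INPUT_USD_PER_MTOK / 1_000_000
    if remaining_usd <= 0:
        return 0
    return math.floor(remaining_usd * 1_000_000 / OUTPUT_USD_PER_MTOK)


# Hypothetical limit: at most $0.50 for a request with a 2,000-token prompt.
cap = max_output_tokens_for_budget(0.50, 2_000)
print(cap)  # prints 7833
```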

Lineage

First public reasoning model from OpenAI.

Sources

OpenAI o1 announcement (Sep 12 2024 preview, Dec 5 2024 GA) ↗