The Open-Source AI Stack

OpenAI o1

OpenAI · Proprietary

The first publicly available frontier reasoning model. Trained to spend extra inference compute on a "private chain of thought" before answering, setting the template the open community would chase with R1.

Cost

$15.00 / Mtok input
$60.00 / Mtok output

OpenAI API · as of 2026-05-21

via Artificial Analysis ↗

Speed

75.8 tok/sec output
25278 ms TTFT

OpenAI API · as of 2026-05-21

via Artificial Analysis ↗
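
The two speed figures above can be combined into a rough end-to-end latency estimate. The sketch below is a minimal illustration, not a measurement method: it assumes that TTFT covers everything before the first visible token (including any hidden reasoning) and that the output speed applies to the visible tokens that follow. The function name and the example token count are hypothetical.

```python
# Figures quoted on this page (OpenAI API, as of 2026-05-21).
TTFT_MS = 25278.0          # time to first token, in milliseconds
OUTPUT_TOK_PER_SEC = 75.8  # output speed, in tokens per second


def estimate_latency_seconds(visible_output_tokens: int,
                             ttft_ms: float = TTFT_MS,
                             tokens_per_second: float = OUTPUT_TOK_PER_SEC) -> float:
    """Rough wall-clock estimate: wait for the first token, then stream the rest.

    Assumption: TTFT already includes hidden reasoning time, so only the
    visible answer tokens are streamed at the quoted output speed.
    """
    if visible_output_tokens < 0:
        raise ValueError("visible_output_tokens must be non-negative")
    return ttft_ms / 1000.0 + visible_output_tokens / tokens_per_second


if __name__ == "__main__":
    # Hypothetical 758-token answer: about 25.3 s of waiting plus 10 s of streaming.
    print(f"{estimate_latency_seconds(758):.1f} s")  # prints 35.3 s
```

Under these assumptions the wait before the first token dominates for short answers, so trimming the visible answer length does little to reduce total latency.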

Why people cared

OpenAI o1 was the first publicly available frontier reasoning model and the existence proof for spending extra inference-time compute on a private chain of thought before answering. The September 2024 preview release and the December 2024 GA established the template: the model produces a hidden reasoning trace, billed at output rates, before the user-visible answer. Its scores on GPQA-Diamond and AIME materially exceeded those of GPT-4o, a model from the same architecture-and-data class.

The pricing structure was new to the market. At $60 per million output tokens, with reasoning traces consuming most of the output budget, a single hard problem could cost dollars rather than fractions of a cent.

That created two follow-on stories. First, the open community responded with DeepSeek R1, released under the MIT license about four months after the o1 preview, demonstrating that the reasoning recipe was within reach of organizations not at OpenAI's scale. Second, the reasoning-vs-cost framing made "thinking budget" a first-class deployment knob: later OpenAI releases (o3, o3-mini) and competitor responses (Claude 3.7 extended thinking, Gemini 2.5 thinking) let the developer dial how much inference compute to spend per request.
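
The cost arithmetic behind that pricing point can be sketched as follows. This is an illustrative calculation only: the rates are the ones listed on this page, while the function name and the token counts (2,000 input, 1,000 visible output, 30,000 hidden reasoning) are hypothetical values chosen to show how a reasoning-heavy request adds up.

```python
# Rates listed on this page (OpenAI API, as of 2026-05-21), in USD per million tokens.
INPUT_USD_PER_MTOK = 15.00
OUTPUT_USD_PER_MTOK = 60.00


def estimate_cost_usd(input_tokens: int,
                      visible_output_tokens: int,
                      reasoning_tokens: int) -> float:
    """Estimate the cost of one request.

    Hidden reasoning tokens are billed at the output rate, so they are
    added to the visible output tokens before pricing.
    """
    billed_output_tokens = visible_output_tokens + reasoning_tokens
    total = (input_tokens * INPUT_USD_PER_MTOK
             + billed_output_tokens * OUTPUT_USD_PER_MTOK)
    return total / 1_000_000


if __name__ == "__main__":
    # Hypothetical hard problem: most of the billed output is hidden reasoning.
    print(f"${estimate_cost_usd(2_000, 1_000, 30_000):.2f}")  # prints $1.89
    # The same request with no reasoning tokens, for comparison.
    print(f"${estimate_cost_usd(2_000, 1_000, 0):.2f}")       # prints $0.09
```

In this made-up example the hidden trace accounts for $1.80 of the $1.89 total, which is why the per-problem cost is driven by how long the model reasons rather than by the length of the visible answer.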

Architecture

[Architecture diagram: tokens in → Embedding (vocab not disclosed) → × N layers (architecture not disclosed; proprietary or undocumented) → Output projection → tokens out]
Schema-generated from data/models.yaml. Every label is auditable against the model's sources.

Specs

Architecture: unknown
Total params: not disclosed
Active params: not disclosed
Context window: not verified
Attention: unknown
Position encoding: unknown
Post-training: RLHF
OSI-approved: no
Data released: no
Training code: not released

Benchmarks

Each score carries the date it was published; we never infer or interpolate missing scores.

General reasoning

MMLU-Pro 84.1 as of 2026-05-21 source ↗
GPQA-Diamond 77.3 as of 2024-12-05 source ↗

Code

LiveCodeBench 67.9 as of 2026-05-21 source ↗

Math

MATH 97.0 as of 2026-05-21 source ↗
AIME 2024 83.3 as of 2024-12-05 source ↗

Recommended use cases

  • math reasoning
  • code reasoning
  • complex multi-step problems

Available quantizations

None. The weights are not distributed, so there are no public quantizations.

Notable innovations

  • Inference-time reasoning compute
  • Chain-of-thought as a training target

Known limitations

  • Reasoning traces are billed as output tokens but are not visible to the user, so cost per problem can be hard to predict. source ↗

Lineage

First public reasoning model from OpenAI.

Sources