The Open-Source AI Stack


Grok 3

xAI · 2025-02-17 · Proprietary

xAI's third-generation flagship, trained on the Colossus supercomputer (approximately 200,000 GPUs) with roughly 10x the compute of Grok 2. Released alongside a separate Grok 3 Reasoning variant and a DeepSearch product, with xAI claiming wins over GPT-4o on AIME math and GPQA science benchmarks. API access launched in April 2025.

Cost

$4.00 / Mtok input

$20.00 / Mtok output

xAI API · as of 2026-05-21
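As a rough illustration of what these rates mean per request, the sketch below multiplies token counts by the listed prices. The helper name `estimate_cost_usd` and the example token counts are hypothetical, and the calculation assumes simple per-token billing with no caching discounts or other adjustments.

```python
from decimal import Decimal

# Prices listed above for the xAI API (USD per million tokens, as of 2026-05-21).
INPUT_USD_PER_MTOK = Decimal("4.00")
OUTPUT_USD_PER_MTOK = Decimal("20.00")
TOKENS_PER_MTOK = Decimal(1_000_000)


def estimate_cost_usd(input_tokens: int, output_tokens: int) -> Decimal:
    """Estimate one request's cost from its token counts (hypothetical helper)."""
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    input_cost = Decimal(input_tokens) * INPUT_USD_PER_MTOK / TOKENS_PER_MTOK
    output_cost = Decimal(output_tokens) * OUTPUT_USD_PER_MTOK / TOKENS_PER_MTOK
    return input_cost + output_cost


if __name__ == "__main__":
    # Example: a 10,000-token prompt with a 2,000-token completion.
    print(estimate_cost_usd(10_000, 2_000))
```

At these rates an output token costs five times as much as an input token, so long completions tend to dominate the bill.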

Speed

Output speed: not reported

TTFT: not reported

as of 2026-05-21

Architecture

Schema-generated from data/models.yaml. Every label is auditable against the model's sources.

Specs

Architecture: unknown
Total params: not disclosed
Active params: not disclosed
Context window: 131K tokens
Attention: unknown
Position encoding: unknown
Training hardware: H100
Post-training: RLHF
OSI-approved: no
Data released: no
Training code: not released

Benchmarks

Each score carries the date it was published; we never infer or interpolate missing scores.

General reasoning

MMLU-Pro	79.9	as of 2026-05-21	source ↗
GPQA-Diamond	69.3	as of 2026-05-21	source ↗

Code

LiveCodeBench	42.5	as of 2026-05-21

Math

MATH	87.0	as of 2026-05-21	source ↗
AIME 2024	33.0	as of 2026-05-21	source ↗
AIME 2025	58.0	as of 2026-05-21	source ↗
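One way to honor the "dated, never inferred" rule in code is to store each score with its date and return nothing for benchmarks that have no entry. This is a minimal sketch, not the site's actual schema: the `Score` class, the `GROK_3_SCORES` list, and the `lookup` function are hypothetical names, and the values are copied from the General reasoning and Math tables above.

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Optional


@dataclass(frozen=True)
class Score:
    """One published benchmark score and the date attached to it."""
    benchmark: str
    value: float
    as_of: date


AS_OF = date(2026, 5, 21)

# Values copied from the General reasoning and Math tables above.
GROK_3_SCORES: List[Score] = [
    Score("MMLU-Pro", 79.9, AS_OF),
    Score("GPQA-Diamond", 69.3, AS_OF),
    Score("MATH", 87.0, AS_OF),
    Score("AIME 2024", 33.0, AS_OF),
    Score("AIME 2025", 58.0, AS_OF),
]


def lookup(scores: List[Score], benchmark: str) -> Optional[Score]:
    """Return the recorded score, or None if none exists (never an estimate)."""
    for score in scores:
        if score.benchmark == benchmark:
            return score
    return None


if __name__ == "__main__":
    print(lookup(GROK_3_SCORES, "AIME 2025"))
    print(lookup(GROK_3_SCORES, "HumanEval"))  # not listed above, so None
```

Returning `None` rather than a default such as 0 keeps a missing score distinguishable from a genuinely low one.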

Available quantizations

None. The weights are not distributed, so there are no public quantizations.

Notable innovations

· Think mode and Big Brain mode for reasoning
· DeepSearch over the web and X
· Trained on the Colossus cluster (approximately 200,000 GPUs)

Sources

Grok 3 (Wikipedia, accessed 2026-05-19) ↗