The Open-Source AI Stack

Grok 4

xAI · Proprietary

xAI's flagship successor to Grok 3, released on July 9, 2025 and formally announced the next day. xAI reported that the Grok 4 Heavy variant scored 50.7 percent on the text-only subset of Humanity's Last Exam (HLE), which the lab described as a first for any model. A specialized coding variant followed shortly after.

Cost

$5.50 / Mtok input

$27.50 / Mtok output

· as of 2026-05-21
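
The per-Mtok prices above translate into a per-request cost by simple proportion. The following is a minimal sketch of that arithmetic, assuming flat pricing with no caching discounts or tiering (neither is stated on this page); the function name and the sample token counts are hypothetical.

```python
from decimal import Decimal

# Prices copied from the Cost section above (USD per million tokens).
INPUT_USD_PER_MTOK = Decimal("5.50")
OUTPUT_USD_PER_MTOK = Decimal("27.50")
TOKENS_PER_MTOK = Decimal(1_000_000)


def request_cost_usd(input_tokens: int, output_tokens: int) -> Decimal:
    """Return the cost of one request in USD.

    Assumption: flat per-token pricing, with no cache discounts or tiers.
    """
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    total = (
        Decimal(input_tokens) * INPUT_USD_PER_MTOK
        + Decimal(output_tokens) * OUTPUT_USD_PER_MTOK
    )
    return total / TOKENS_PER_MTOK


if __name__ == "__main__":
    # Hypothetical request: 10,000 prompt tokens and 2,000 completion tokens.
    print(request_cost_usd(10_000, 2_000))
```

At these list prices, output tokens cost five times as much as input tokens, so the 2,000 completion tokens in the sample request cost as much as the 10,000 prompt tokens.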

Speed

Output: not measured (tok/sec)

TTFT: not measured (ms)

· as of 2026-05-21

Architecture

Schema-generated from data/models.yaml. Every label is auditable against the model's sources.

Specs

Architecture: unknown
Total params: not disclosed
Active params: not disclosed
Context window: not verified
Attention: unknown
Position encoding: unknown
Post-training: RLHF
OSI-approved: no
Data released: no
Training code: not released

Benchmarks

Each score carries the date it was published; we never infer or interpolate missing scores.

General reasoning

MMLU-Pro	86.6	as of 2026-05-21	source ↗
GPQA-Diamond	87.7	as of 2026-05-21	source ↗

Code

LiveCodeBench	81.9	as of 2026-05-21

Math

MATH	99.0	as of 2026-05-21	source ↗
AIME 2024	94.3	as of 2026-05-21	source ↗
AIME 2025	92.7	as of 2026-05-21	source ↗
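
The rule stated at the top of this section (every score is dated, and missing scores are never inferred) can be sketched as a small record type plus a lookup that returns nothing when a benchmark has no entry. This is an illustrative sketch only: the `Score` class and `lookup` function are hypothetical names, not the site's actual schema, and the rows are copied from the Math table above.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional, Sequence


@dataclass(frozen=True)
class Score:
    """One benchmark result together with its "as of" date (hypothetical schema)."""
    benchmark: str
    value: float
    as_of: date


# Rows copied from the Math table above.
MATH_SCORES = [
    Score("MATH", 99.0, date(2026, 5, 21)),
    Score("AIME 2024", 94.3, date(2026, 5, 21)),
    Score("AIME 2025", 92.7, date(2026, 5, 21)),
]


def lookup(scores: Sequence[Score], benchmark: str) -> Optional[Score]:
    """Return the recorded score, or None if the benchmark has no entry.

    A missing score stays missing: nothing is averaged, inferred, or
    interpolated from neighbouring rows.
    """
    for score in scores:
        if score.benchmark == benchmark:
            return score
    return None


if __name__ == "__main__":
    print(lookup(MATH_SCORES, "AIME 2025"))
    print(lookup(MATH_SCORES, "AIME 2023"))  # no such row, so None
```

Returning `None` instead of a default number forces callers to handle the missing case explicitly, which keeps a placeholder from being mistaken for a real result.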

Available quantizations

None. The weights are not distributed, so there are no public quantizations.

Notable innovations

· Grok 4 Heavy variant with parallel agents
· 256K-token context window (listed as not verified under Specs)
· First model over 50 percent on HLE text-only per xAI

Lineage

According to xAI, the first model to exceed 50 percent on the text-only subset of HLE.

Derived from

Grok 3 2025-02-17

Derivatives

Grok 4.1 2025-11-17

Sources