The Open-Source AI Stack

Grok 4

xAI · Proprietary

xAI's flagship after Grok 3, released July 9, 2025 and formally announced the next day. The Grok 4 Heavy variant reported 50.7 percent on the text-only subset of Humanity's Last Exam (HLE), which xAI describes as a first for any model. A specialized coding variant followed shortly after.

Cost

$5.50 / Mtok input
$27.50 / Mtok output

as of 2026-05-21

source ↗
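
To make the per-token prices concrete, here is a minimal sketch of the arithmetic, assuming a flat per-million-token rate with no caching discounts, tiers, or minimum charges (the page does not say whether any apply). The helper name `estimate_cost_usd` is hypothetical and is not part of any xAI SDK; only the two prices come from this page.

```python
from decimal import Decimal

# Listed prices from this page, in USD per million tokens (as of 2026-05-21).
INPUT_USD_PER_MTOK = Decimal("5.50")
OUTPUT_USD_PER_MTOK = Decimal("27.50")
TOKENS_PER_MTOK = Decimal(1_000_000)


def estimate_cost_usd(input_tokens: int, output_tokens: int) -> Decimal:
    """Estimate one request's cost from its token counts (hypothetical helper).

    Assumes flat pricing: cost = tokens / 1e6 * price for each direction.
    """
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    input_cost = Decimal(input_tokens) * INPUT_USD_PER_MTOK
    output_cost = Decimal(output_tokens) * OUTPUT_USD_PER_MTOK
    return (input_cost + output_cost) / TOKENS_PER_MTOK


if __name__ == "__main__":
    # Example: 10,000 prompt tokens and 2,000 completion tokens.
    print(estimate_cost_usd(10_000, 2_000))
```

Decimal is used rather than float so that small per-request amounts add up without binary rounding drift when summed over many calls.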

Speed

Output speed: not measured
TTFT: not measured

as of 2026-05-21

source ↗

Architecture

tokens in → Embedding (vocab not disclosed) → × N layers (architecture not disclosed: proprietary or undocumented) → Output projection → tokens out
Schema-generated from data/models.yaml. Every label is auditable against the model's sources.

Specs

Architecture: unknown
Total params: not disclosed
Active params: not disclosed
Context window: not verified
Attention: unknown
Position encoding: unknown
Post-training: RLHF
OSI-approved: no
Data released: no
Training code: not released

Benchmarks

Each score carries the date it was published; we never infer or interpolate missing scores.

General reasoning

MMLU-Pro: 86.6 · as of 2026-05-21 · source ↗
GPQA-Diamond: 87.7 · as of 2026-05-21 · source ↗

Code

LiveCodeBench: 81.9 · as of 2026-05-21 · source ↗

Math

MATH: 99.0 · as of 2026-05-21 · source ↗
AIME 2024: 94.3 · as of 2026-05-21 · source ↗
AIME 2025: 92.7 · as of 2026-05-21 · source ↗
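
The "dated score, never inferred" rule stated at the top of this section can be sketched as a small data structure. This is an illustrative assumption, not the site's actual schema: the `Score` class and `lookup` function are hypothetical names, and only the benchmark names, values, and dates are taken from this page. A benchmark with no published score returns `None` rather than an estimate.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass(frozen=True)
class Score:
    """One published benchmark result with the date attached to it (hypothetical type)."""
    benchmark: str
    value: float
    as_of: date


AS_OF = date(2026, 5, 21)

# Scores exactly as listed on this page.
SCORES = [
    Score("MMLU-Pro", 86.6, AS_OF),
    Score("GPQA-Diamond", 87.7, AS_OF),
    Score("LiveCodeBench", 81.9, AS_OF),
    Score("MATH", 99.0, AS_OF),
    Score("AIME 2024", 94.3, AS_OF),
    Score("AIME 2025", 92.7, AS_OF),
]


def lookup(benchmark: str) -> Optional[Score]:
    """Return the listed score, or None when no score is listed.

    Missing scores are never inferred or interpolated from other benchmarks.
    """
    for score in SCORES:
        if score.benchmark == benchmark:
            return score
    return None


if __name__ == "__main__":
    print(lookup("GPQA-Diamond"))
    print(lookup("HumanEval"))  # not listed on this page, so nothing is returned
```

Returning `None` keeps "no score listed" distinct from a score of zero, which matters when a consumer averages or ranks models.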

Available quantizations

None. The weights are not distributed, so there are no public quantizations.

Notable innovations

  • Grok 4 Heavy variant with parallel agents
  • 256K context window
  • First model over 50 percent on HLE text-only, per xAI

Lineage

Per xAI, the first model to exceed 50 percent on the HLE text-only subset.

Derived from

Grok 3 2025-02-17