The Open-Source AI Stack

Grok 4 vs Kimi K2 Instruct

Rows highlighted in warm gray are where the models differ. Numbers carry their as-of date and primary source.

Specs

Field | A: Grok 4 | B: Kimi K2 Instruct
Released | | 2025-07-11
Developer | xAI | Moonshot AI
Openness | Proprietary | Open
License | Proprietary | Modified MIT
OSI-approved | no | no
Data released | no | no
Training code | no | no
Architecture | unknown | MoE
Total params | | 1T
Active params | | 32B
Experts | | 384 (8 active)
Context window | | 128K
Attention | unknown | MLA
Position enc. | unknown | RoPE
Pretraining tokens | | 15.5T
Post-training | RLHF | SFT, RLHF
Training hardware | |
$/M input $5.50
$/M output $27.50
Output tok/sec 0
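
The parameter, expert, and price rows lend themselves to a little arithmetic. The following is a minimal sketch, not part of the original comparison: the constant and function names are illustrative, and the numbers are copied from the Specs rows as listed.

```python
# Values copied from the Specs rows; names are illustrative, not an official API.
TOTAL_PARAMS = 1e12     # "Total params 1T"
ACTIVE_PARAMS = 32e9    # "Active params 32B"
EXPERTS_TOTAL = 384     # "Experts 384 (8 active)"
EXPERTS_ACTIVE = 8


def share(part: float, whole: float) -> float:
    """Return part/whole as a fraction between 0 and 1."""
    return part / whole


def request_cost_usd(
    input_tokens: int,
    output_tokens: int,
    usd_per_m_input: float = 5.50,    # "$/M input" row
    usd_per_m_output: float = 27.50,  # "$/M output" row
) -> float:
    """Estimate the cost of one request from per-million-token prices."""
    return (
        input_tokens / 1_000_000 * usd_per_m_input
        + output_tokens / 1_000_000 * usd_per_m_output
    )


active_param_share = share(ACTIVE_PARAMS, TOTAL_PARAMS)      # share of weights used per token
routed_expert_share = share(EXPERTS_ACTIVE, EXPERTS_TOTAL)   # share of experts used per token
example_cost = request_cost_usd(10_000, 2_000)               # 10K tokens in, 2K tokens out

print(f"active params: {active_param_share:.1%}")
print(f"active experts: {routed_expert_share:.1%}")
print(f"example request: ${example_cost:.2f}")
```

The active-parameter share comes out higher than the active-expert share. A plausible reading, though the table does not say so, is that non-expert weights such as the attention layers run for every token.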

Benchmarks

Missing scores render as "not reported" and are never inferred. Bold highlights the leader per benchmark.
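
The display rule above can be sketched in a few lines. This is a hypothetical illustration rather than the site's actual code: the function names are made up, and it assumes that no leader is marked when either score is missing or when the two scores tie.

```python
from typing import Optional


def format_score(score: Optional[float]) -> str:
    """Render a score, or the literal 'not reported' when it is missing."""
    return "not reported" if score is None else f"{score:.1f}"


def leader(score_a: Optional[float], score_b: Optional[float]) -> Optional[str]:
    """Return 'A' or 'B' for the higher score.

    Assumption: a missing score is never treated as zero, so a benchmark
    with only one reported score has no leader; ties also return None.
    """
    if score_a is None or score_b is None:
        return None
    if score_a == score_b:
        return None
    return "A" if score_a > score_b else "B"


# Example rows taken from the benchmark tables.
print(format_score(86.6), format_score(81.1), leader(86.6, 81.1))
print(format_score(None), format_score(65.8), leader(None, 65.8))
```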

General reasoning

MMLU-Pro | 86.6 (2026-05-21) | 81.1 (2025-07-15)
GPQA-Diamond | 87.7 (2026-05-21) | 75.1 (2025-07-15)

Code

SWE-Bench Verified | not reported | 65.8 (2025-07-15)
LiveCodeBench | 81.9 (2026-05-21) | not reported

Math

MATH | 99.0 (2026-05-21) | not reported
AIME 2024 | 94.3 (2026-05-21) | not reported
AIME 2025 | 92.7 (2026-05-21) | not reported

Context · A

xAI's flagship after Grok 3, released July 9, 2025 and formally announced the next day. The Grok 4 Heavy variant reportedly scored 50.7 percent on the text-only subset of Humanity's Last Exam, a first for any model according to xAI. A specialized coding variant followed shortly after.

Context · B

A trillion-parameter open-weights MoE optimized for agentic tool use. Its strong SWE-Bench results made it a viable open alternative to closed coding agents at release.
