
GLM-4.6 vs Claude Sonnet 4.5

Rows highlighted in warm gray mark fields where the two models differ. Each number carries its as-of date and primary source.

Specs

Field	A: GLM-4.6	B: Claude Sonnet 4.5
Released	2025-09-30	2025-09-29
Developer	Zhipu AI	Anthropic
Openness	Open	Proprietary
License	MIT	Proprietary
OSI-approved	yes	no
Data released	no	no
Training code	no	no
Architecture	moe	unknown
Total params	357B	—
Active params	—	—
Experts	—	—
Context window	200K	—
Attention	unknown	unknown
Position enc.	unknown	unknown
Pretraining tokens	—	—
Post-training	sft, rlhf	rlhf, constitutional
Training hardware	—	—
$/M input	$0.60	$3.00
$/M output	$2.20	$15.00
Output tok/sec	30.7	48.8
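
To make the price rows concrete, the sketch below works out what a single workload would cost at the listed rates. The workload size (2M input tokens, 0.5M output tokens) and the helper name `workload_cost` are illustrative assumptions, not figures from either vendor; the calculation also ignores caching, batch discounts, and any tiered pricing.

```python
# Listed prices in USD per million tokens, copied from the Specs table above.
PRICES = {
    "GLM-4.6": {"input": 0.60, "output": 2.20},
    "Claude Sonnet 4.5": {"input": 3.00, "output": 15.00},
}


def workload_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Return the USD cost of one workload at the listed per-million-token prices."""
    price = PRICES[model]
    total = input_tokens * price["input"] + output_tokens * price["output"]
    return total / 1_000_000


if __name__ == "__main__":
    # Hypothetical workload: 2M input tokens and 0.5M output tokens.
    for model in PRICES:
        cost = workload_cost(model, 2_000_000, 500_000)
        print(f"{model}: ${cost:.2f}")
```

For this assumed workload the listed prices give $2.30 for GLM-4.6 and $13.50 for Claude Sonnet 4.5. The ratio shifts with the input/output mix, because the output price gap (about 6.8x) is larger than the input price gap (5x).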

Benchmarks

Missing scores render as not reported; never inferred. Bold highlights the leader per benchmark.

General reasoning

MMLU-Pro	78.4 2026-05-21	86.0 2026-05-21
GPQA-Diamond	63.2 2026-05-21	72.7 2026-05-21

Code

SWE-Bench Verified	—	77.2 2025-09-29
LiveCodeBench	56.1 2026-05-21	59.0 2026-05-21

Math

AIME 2025	44.3 2026-05-21	37.0 2026-05-21
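
The "never inferred" rule for missing scores can be sketched as a small leader-selection helper. This is a minimal illustration, not the page's actual rendering logic: the function name `leader`, the use of `None` for a score that is not reported, the choice to name no leader when only one side reports a score, and the assumption that higher is better on every benchmark are all assumptions made for the example. Scores are read in column order (A, then B).

```python
from typing import Optional

# Scores copied from the Benchmarks section as (A: GLM-4.6, B: Claude Sonnet 4.5).
# None stands for "not reported".
SCORES = {
    "MMLU-Pro": (78.4, 86.0),
    "GPQA-Diamond": (63.2, 72.7),
    "SWE-Bench Verified": (None, 77.2),
    "LiveCodeBench": (56.1, 59.0),
    "AIME 2025": (44.3, 37.0),
}


def leader(a: Optional[float], b: Optional[float]) -> Optional[str]:
    """Return "A" or "B" for the higher score, or None when no leader can be named."""
    if a is None or b is None:
        # Assumption: a missing score is never treated as zero or estimated,
        # so a row with only one reported score has no leader.
        return None
    if a == b:
        return None  # a tie names no leader
    return "A" if a > b else "B"


if __name__ == "__main__":
    for name, (a, b) in SCORES.items():
        print(f"{name}: {leader(a, b) or 'not comparable'}")
```

Under these assumptions, B leads on MMLU-Pro, GPQA-Diamond, and LiveCodeBench, A leads on AIME 2025, and SWE-Bench Verified is left without a leader because A has no reported score.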

Context · A

Released September 30, 2025 as a refresh of GLM-4.5: a 357B-parameter MoE with a 200K context window (up from 128K in 4.5) and a 128K maximum output. Zhipu reports a 27 percent coding improvement over 4.5 and parity with Claude Sonnet 4 on 8 public benchmarks. MIT licensed.

Context · B

Released September 29, 2025 as an incremental update of the Sonnet line at the same $3 / $15 price point as Sonnet 4. Shipped alongside Claude Code checkpoints, a native VS Code extension, context editing and memory tools for long-running agents, and the Claude Agent SDK. An OSWorld score of 61.4% positioned it as Anthropic's strongest computer-use model at launch.
