Gemma 4 26B-A4B vs Claude Opus 4.7

Rows highlighted in warm gray are where the models differ. Numbers carry their as-of date and primary source.

Specs

Field	A: Gemma 4 26B-A4B	B: Claude Opus 4.7
Released	2026-04-02	2026-04-16
Developer	Google DeepMind	Anthropic
Openness	Open weights	Proprietary
License	Apache-2.0	Proprietary
OSI-approved	yes	no
Data released	no	no
Training code	no	no
Architecture	moe	unknown
Total params	26B	—
Active params	3.8B	—
Experts	—	—
Context window	262K	1.0M
Attention	gqa	unknown
Position enc.	rope	unknown
Pretraining tokens	—	—
Post-training	sft, rlhf	rlhf, constitutional
Training hardware	—	—
$/M input	—	$5.00
$/M output	—	$25.00
Output tok/sec	—	48.6

Benchmarks

Missing scores render as not reported; never inferred. Bold highlights the leader per benchmark.

General reasoning

Benchmark	A: Gemma 4 26B-A4B	B: Claude Opus 4.7
GPQA-Diamond	—	91.4 (as of 2026-05-21)

Context · A

The sparse mixture-of-experts sibling in the Gemma 4 family: 26B total parameters, of which only about 3.8B are activated per token. Single-stream decode is memory-bandwidth-bound, so the small active count lets it decode far faster than the dense 31B at the same quantization. The trade-off is memory: all experts must stay resident, so the footprint tracks the 26B total rather than the 3.8B active count. That combination of speed and capacity makes it a strong general-purpose local model.
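The bandwidth argument can be made concrete with a back-of-envelope estimate. This is a minimal sketch, assuming decode speed is capped only by how fast the active weights stream from memory. The 4-bit quantization and the 400 GB/s bandwidth figure are illustrative assumptions rather than measurements, and the function names are hypothetical. KV-cache reads and compute time are ignored, so real throughput will be lower than these ceilings.

```python
def decode_ceiling_tok_per_sec(active_params_b, bits_per_param, bandwidth_gb_s):
    """Upper bound on single-stream decode speed when weight reads dominate.

    Each generated token must read every active weight once, so the ceiling
    is memory bandwidth divided by the bytes of active weights.
    """
    bytes_per_token = active_params_b * 1e9 * bits_per_param / 8
    return bandwidth_gb_s * 1e9 / bytes_per_token


def weight_footprint_gb(total_params_b, bits_per_param):
    """Memory needed to keep all weights resident (experts included)."""
    return total_params_b * 1e9 * bits_per_param / 8 / 1e9


# Assumed setup: 4-bit weights on a device with 400 GB/s memory bandwidth.
BITS = 4
BANDWIDTH_GB_S = 400

moe_ceiling = decode_ceiling_tok_per_sec(3.8, BITS, BANDWIDTH_GB_S)    # 26B-A4B
dense_ceiling = decode_ceiling_tok_per_sec(31.0, BITS, BANDWIDTH_GB_S)  # dense 31B
speedup = moe_ceiling / dense_ceiling

moe_footprint = weight_footprint_gb(26.0, BITS)
dense_footprint = weight_footprint_gb(31.0, BITS)

print(f"MoE ceiling:   {moe_ceiling:.0f} tok/s, {moe_footprint:.1f} GB resident")
print(f"Dense ceiling: {dense_ceiling:.0f} tok/s, {dense_footprint:.1f} GB resident")
print(f"Ceiling ratio: {speedup:.1f}x")
```

Under these assumptions the bandwidth and quantization cancel out of the ratio, leaving 31 / 3.8, roughly 8x, while the resident footprints differ by only 26 versus 31 parts. That is the trade the paragraph describes: decode speed close to a 4B model, memory cost close to a 26B one.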

Context · B

First Claude with high-resolution image input, accepting up to 2576 pixels on the long edge (about 3.75 megapixels, more than triple the limit of prior Claude models). Released April 16, 2026, at pricing unchanged from Opus 4.6 ($5 input / $25 output per Mtok), with the 1M context window as standard. Adds a new xhigh effort level for finer control over the reasoning-versus-latency trade-off.
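To put the listed rates in per-request terms, here is a minimal sketch of the arithmetic. It assumes flat per-token billing at the $5 / $25 per Mtok rates quoted above and that no discounts or surcharges apply, which this page does not confirm. The function name and the token counts are hypothetical.

```python
INPUT_USD_PER_MTOK = 5.00    # listed $/M input
OUTPUT_USD_PER_MTOK = 25.00  # listed $/M output


def request_cost_usd(input_tokens, output_tokens):
    """Estimated cost of one request, assuming flat per-token rates."""
    total = (input_tokens * INPUT_USD_PER_MTOK
             + output_tokens * OUTPUT_USD_PER_MTOK)
    return total / 1_000_000


# Hypothetical request: a 200K-token prompt with a 4K-token reply.
cost = request_cost_usd(200_000, 4_000)
print(f"${cost:.2f}")
```

With these example counts the prompt contributes $1.00 and the reply $0.10, so long prompts dominate the bill even though output tokens cost five times as much each.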
