Models · Compare

GPT-5.5 vs DeepSeek-V4 Pro

Rows highlighted in warm gray are where the models differ. Numbers carry their as-of date and primary source.

Specs

Field	A: GPT-5.5	B: DeepSeek-V4 Pro
Released	2026-04-23	—
Developer	OpenAI	DeepSeek
Openness	Proprietary	Open
License	Proprietary	MIT
OSI-approved	no	yes
Data released	no	no
Training code	no	no
Architecture	unknown	moe
Total params	—	1.6T
Active params	—	49B
Experts	—	—
Context window	1.0M	1.0M
Attention	unknown	hybrid-csa-hca
Position enc.	unknown	unknown
Pretraining tokens	—	32.0T
Post-training	rlhf	sft, grpo, distillation
Training hardware	—	—
$/M input	$5.00	$1.74
$/M output	$30.00	$3.48
Output tok/sec	66.7	29.8

Benchmarks

Missing scores render as not reported; never inferred. Bold highlights the leader per benchmark.

General reasoning

MMLU-Pro	—	87.5 2026-04-24
GPQA-Diamond	93.5 2026-05-21	88.8 2026-05-21

Code

HumanEval	—	76.8 2026-04-24
SWE-Bench Verified	—	80.6 2026-04-24
LiveCodeBench	—	93.5 2026-04-24

Context · A

Released April 23 2026, with GPT-5.5 and GPT-5.5 Pro hitting the API the next day. GPT-5.5 Instant replaced GPT-5.3 Instant as the ChatGPT default May 5 2026. Expanded context to 1M tokens; pricing above 272K input doubles input and 1.5x's output for the session.

Context · B

First V4-generation flagship, released April 24 2026 with a 1.6T-parameter MoE activating 49B per token. Combines Compressed Sparse Attention and Heavily Compressed Attention to cut single-token inference FLOPs to 27 percent of V3.2 and KV cache to 10 percent. 1M context is now the DeepSeek default. V4-Flash (284B / 13B active) shipped alongside.

GPT-5.5 detail → · DeepSeek-V4 Pro detail →