
Gemini 2.5 Pro vs Llama-3.3-Nemotron Super 49B v1

Rows highlighted in warm gray are where the models differ. Numbers carry their as-of date and primary source.

Specs

Field	A: Gemini 2.5 Pro	B: Llama-3.3-Nemotron Super 49B v1
Released	2025-03-25	2025-03-18
Developer	Google DeepMind	NVIDIA
Openness	Proprietary	Open weights
License	Proprietary	NVIDIA Open Model License
OSI-approved	no	no
Data released	no	yes
Training code	no	no
Architecture	unknown	dense
Total params	—	49B
Active params	—	—
Experts	—	—
Context window	1.0M	131K
Attention	unknown	skip-attention
Position enc.	unknown	rope
Pretraining tokens	—	40B
Post-training	rlhf	sft, grpo
Training hardware	—	H100
$/M input	$1.25	$0.00
$/M output	$10.00	$0.00
Output tok/sec	127.3	0

Benchmarks

Missing scores render as not reported; never inferred. Bold highlights the leader per benchmark.

General reasoning

MMLU-Pro	86.2 2026-05-21	69.8 2026-05-21
GPQA-Diamond	—	66.7 2025-03-18

Code

LiveCodeBench	80.1 2026-05-21	28.0 2026-05-21

Math

MATH	96.7 2026-05-21	96.6 2025-03-18
AIME 2024	—	19.3 2026-05-21
AIME 2025	87.7 2026-05-21	58.4 2025-03-18

Held-out / arena

IFEval	—	89.2 2025-03-18
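
The Benchmarks note says that missing scores are never inferred and that the leader per benchmark is highlighted. A minimal sketch of one way to apply that rule follows. The `leader` function and the `rows` layout are hypothetical, not this page's actual rendering code, and the sketch assumes that a row with a missing score, or a tie, has no leader.

```python
from typing import Optional


def leader(score_a: Optional[float], score_b: Optional[float]) -> Optional[str]:
    """Return "A", "B", or None when no leader can be named."""
    # Assumption: a missing score is never filled in, so a row with a gap
    # gets no leader rather than defaulting to the model that reported one.
    if score_a is None or score_b is None:
        return None
    # Assumption: a tie is not highlighted for either model.
    if score_a == score_b:
        return None
    return "A" if score_a > score_b else "B"


# Scores copied from the tables above; None stands for "not reported".
rows = {
    "MMLU-Pro": (86.2, 69.8),
    "GPQA-Diamond": (None, 66.7),
    "AIME 2025": (87.7, 58.4),
}

leaders = {name: leader(a, b) for name, (a, b) in rows.items()}
```

Under these assumptions GPQA-Diamond has no leader, because model A has no reported score to compare against. Scores with different as-of dates are compared as given.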

Context · A

The first Gemini release to clearly lead on LMArena Elo and on hard reasoning benchmarks. Native 1M-token context, with an expansion to 2M tokens reportedly planned.

Context · B

NVIDIA's derivative of Llama 3.3 70B, distilled using neural architecture search (NAS) and aimed at throughput on a single data-center GPU. Uses skip-attention and variable-FFN blocks, selected per layer, to trade quality against FLOPs. Released March 18, 2025, alongside the public Llama-Nemotron-Post-Training-Dataset-v1 (30M samples).
