The Open-Source AI Stack
RSS
All models

Models · Compare

Gemini 2.5 Pro vs Llama-3.3-Nemotron Super 49B v1

Rows highlighted in warm gray are where the models differ. Numbers carry their as-of date and primary source.

Specs

Field A: Gemini 2.5 Pro B: Llama-3.3-Nemotron Super 49B v1
Released 2025-03-252025-03-18
Developer Google DeepMindNVIDIA
Openness ProprietaryOpen weights
License ProprietaryNVIDIA Open Model License
OSI-approved nono
Data released noyes
Training code nono
Architecture unknowndense
Total params 49B
Active params
Experts
Context window 1.0M131K
Attention unknownskip-attention
Position enc. unknownrope
Pretraining tokens 40B
Post-training rlhfsft, grpo
Training hardware H100
$/M input $1.25$0.00
$/M output $10.00$0.00
Output tok/sec 127.30

Benchmarks

Missing scores render as not reported; never inferred. Bold highlights the leader per benchmark.

General reasoning

MMLU-Pro 86.2 2026-05-21 69.8 2026-05-21
GPQA-Diamond 66.7 2025-03-18

Code

LiveCodeBench 80.1 2026-05-21 28.0 2026-05-21

Math

MATH 96.7 2026-05-21 96.6 2025-03-18
AIME 2024 19.3 2026-05-21
AIME 2025 87.7 2026-05-21 58.4 2025-03-18

Held-out / arena

IFEval 89.2 2025-03-18

Context · A

The first Gemini release to clearly lead on LMArena Elo and on hard reasoning benchmarks. Native 1M-token context with reported 2M expansion in pipeline.

Context · B

NVIDIA's NAS-distilled Llama 3.3 70B aimed at single-data-center-GPU throughput. Uses skip-attention and variable-FFN blocks selected per layer for the quality-versus-FLOPs tradeoff. Released March 18 2025 with the Llama-Nemotron-Post-Training-Dataset-v1 (30M samples) public.

Gemini 2.5 Pro detail → · Llama-3.3-Nemotron Super 49B v1 detail →