Gemini 2.5 Pro vs Llama-3.3-Nemotron Super 49B v1
Rows highlighted in warm gray are where the models differ. Numbers carry their as-of date and primary source.
Specs
| Field | A: Gemini 2.5 Pro | B: Llama-3.3-Nemotron Super 49B v1 |
|---|---|---|
| Released | 2025-03-25 | 2025-03-18 |
| Developer | Google DeepMind | NVIDIA |
| Openness | Proprietary | Open weights |
| License | Proprietary | NVIDIA Open Model License |
| OSI-approved | no | no |
| Data released | no | yes |
| Training code | no | no |
| Architecture | unknown | dense |
| Total params | — | 49B |
| Active params | — | — |
| Experts | — | — |
| Context window | 1.0M | 131K |
| Attention | unknown | skip-attention |
| Position enc. | unknown | RoPE |
| Pretraining tokens | — | 40B |
| Post-training | RLHF | SFT, GRPO |
| Training hardware | — | H100 |
| $/M input | $1.25 | $0.00 |
| $/M output | $10.00 | $0.00 |
| Output tok/sec | 127.3 | — |
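The per-million-token prices above translate into a per-request cost by simple arithmetic. The sketch below assumes a flat rate at every prompt length, which the table implies but does not confirm. The model key, the function name `request_cost`, and the token counts are hypothetical, chosen only for illustration.

```python
from decimal import Decimal

# List prices copied from the Specs table, in USD per million tokens.
# The dictionary key is an illustrative label, not an official API model ID.
PRICE_PER_M = {
    "gemini-2.5-pro": {"input": Decimal("1.25"), "output": Decimal("10.00")},
}


def request_cost(model: str, input_tokens: int, output_tokens: int) -> Decimal:
    """Return the USD cost of one request, assuming flat per-million-token rates."""
    price = PRICE_PER_M[model]
    million = Decimal(1_000_000)
    return (price["input"] * input_tokens + price["output"] * output_tokens) / million


# Hypothetical request: a 200,000-token prompt and a 4,000-token reply.
cost = request_cost("gemini-2.5-pro", input_tokens=200_000, output_tokens=4_000)
print(f"${cost:.2f}")
```

`Decimal` is used instead of `float` so that currency amounts do not pick up binary rounding error.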
Benchmarks
Missing scores render as not reported; never inferred. Bold highlights the leader per benchmark.
General reasoning
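| Benchmark | A: Gemini 2.5 Pro | B: Llama-3.3-Nemotron Super 49B v1 |
|---|---|---|
| MMLU-Pro | **86.2** (as of 2026-05-21) | 69.8 (as of 2026-05-21) |
| GPQA-Diamond | — | 66.7 (as of 2025-03-18) |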
Code
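| Benchmark | A: Gemini 2.5 Pro | B: Llama-3.3-Nemotron Super 49B v1 |
|---|---|---|
| LiveCodeBench | **80.1** (as of 2026-05-21) | 28.0 (as of 2026-05-21) |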
Math
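| Benchmark | A: Gemini 2.5 Pro | B: Llama-3.3-Nemotron Super 49B v1 |
|---|---|---|
| MATH | **96.7** (as of 2026-05-21) | 96.6 (as of 2025-03-18) |
| AIME 2024 | — | 19.3 (as of 2026-05-21) |
| AIME 2025 | **87.7** (as of 2026-05-21) | 58.4 (as of 2025-03-18) |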
Held-out / arena
| Benchmark | A: Gemini 2.5 Pro | B: Llama-3.3-Nemotron Super 49B v1 |
|---|---|---|
| IFEval | — | 89.2 (as of 2025-03-18) |
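The rendering rule for these tables (missing scores shown as not reported, the leader in bold) can be sketched as follows. This is a minimal illustration, not the page's actual implementation. It assumes that higher scores are better on every benchmark and that a leader is marked only when both models report a score. The page states neither assumption, and the function name `render_row` is hypothetical.

```python
from typing import Optional


def render_row(name: str, a: Optional[float], b: Optional[float]) -> str:
    """Format one benchmark row; None means the score was not reported."""

    def cell(score: Optional[float], is_leader: bool) -> str:
        if score is None:
            # Never infer a missing score; say so explicitly.
            return "not reported"
        text = f"{score:.1f}"
        return f"**{text}**" if is_leader else text

    # Assumption: a leader exists only when both scores are present and differ.
    both = a is not None and b is not None
    a_leads = both and a > b
    b_leads = both and b > a
    return f"| {name} | {cell(a, a_leads)} | {cell(b, b_leads)} |"


print(render_row("MMLU-Pro", 86.2, 69.8))
print(render_row("GPQA-Diamond", None, 66.7))
```

A tie leaves both cells unbolded, since neither model leads.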
Context · A
The first Gemini release to clearly lead on LMArena Elo and on hard reasoning benchmarks. Native 1M-token context, with an expansion to 2M tokens reportedly planned.
Context · B
NVIDIA's derivative of Llama 3.3 70B, produced with neural architecture search (NAS) and distillation and aimed at high throughput on a single data-center GPU. It uses skip-attention and variable-FFN blocks, selected per layer to balance quality against FLOPs. Released on 2025-03-18 alongside the public Llama-Nemotron-Post-Training-Dataset-v1 (30M samples).