The Open-Source AI Stack
RSS
All models

Models · Compare

Nemotron 3 Nano vs Gemini 3 Flash

Rows highlighted in warm gray are where the models differ. Numbers carry their as-of date and primary source.

Specs

Field A: Nemotron 3 Nano B: Gemini 3 Flash
Released 2025-12-152025-12-17
Developer NVIDIAGoogle DeepMind
Openness Open weightsProprietary
License NVIDIA Open Model LicenseProprietary
OSI-approved nono
Data released nono
Training code nono
Architecture hybrid-mamba-transformer-moeunknown
Total params 30B
Active params
Experts
Context window 1.0M1.0M
Attention hybrid-mamba2-gqaunknown
Position enc. unknownunknown
Pretraining tokens
Post-training sft, rlhfrlhf
Training hardware
$/M input $0.50
$/M output $3.00
Output tok/sec 185.9

Benchmarks

Missing scores render as not reported; never inferred. Bold highlights the leader per benchmark.

General reasoning

MMLU-Pro 88.2 2026-05-21
GPQA-Diamond 90.4 2025-12-17

Code

SWE-Bench Verified 78.0 2025-12-17
LiveCodeBench 79.7 2026-05-21

Math

AIME 2025 55.7 2026-05-21

Context · A

First Nemotron 3 family member, a 31.6B-total / 3.2B-active hybrid Mamba-Transformer MoE. 52 layers (23 MoE, 23 Mamba-2, 6 GQA), 128 routed plus 1 shared expert with 6 activated per token. NVIDIA reports 3.3x Qwen3-30B-A3B throughput on a single H200 at 8K input / 16K output, with 1M-context support. Super and Ultra variants planned for H1 2026.

Context · B

Speed tier of the Gemini 3 family, released December 17 2025. Google reported 3x the speed of Gemini 2.5 Pro while outperforming it on coding and reasoning, with 30 percent fewer tokens on average for everyday tasks. Became the default in the Gemini app and rolled into Search AI Mode at launch.

Nemotron 3 Nano detail → · Gemini 3 Flash detail →