Models · Compare

Nemotron 3 Nano vs Gemini 3 Flash

Rows highlighted in warm gray are where the models differ. Numbers carry their as-of date and primary source.

Specs

Field	A: Nemotron 3 Nano	B: Gemini 3 Flash
Released	2025-12-15	2025-12-17
Developer	NVIDIA	Google DeepMind
Openness	Open weights	Proprietary
License	NVIDIA Open Model License	Proprietary
OSI-approved	no	no
Data released	no	no
Training code	no	no
Architecture	hybrid-mamba-transformer-moe	unknown
Total params	30B	—
Active params	—	—
Experts	—	—
Context window	1.0M	1.0M
Attention	hybrid-mamba2-gqa	unknown
Position enc.	unknown	unknown
Pretraining tokens	—	—
Post-training	sft, rlhf	rlhf
Training hardware	—	—
$/M input	—	$0.50
$/M output	—	$3.00
Output tok/sec	—	185.9

Benchmarks

Missing scores render as not reported; never inferred. Bold highlights the leader per benchmark.

General reasoning

MMLU-Pro	—	88.2 2026-05-21
GPQA-Diamond	—	90.4 2025-12-17

Code

SWE-Bench Verified	—	78.0 2025-12-17
LiveCodeBench	—	79.7 2026-05-21

Math

AIME 2025

—

55.7 2026-05-21

Context · A

First Nemotron 3 family member, a 31.6B-total / 3.2B-active hybrid Mamba-Transformer MoE. 52 layers (23 MoE, 23 Mamba-2, 6 GQA), 128 routed plus 1 shared expert with 6 activated per token. NVIDIA reports 3.3x Qwen3-30B-A3B throughput on a single H200 at 8K input / 16K output, with 1M-context support. Super and Ultra variants planned for H1 2026.

Context · B

Speed tier of the Gemini 3 family, released December 17 2025. Google reported 3x the speed of Gemini 2.5 Pro while outperforming it on coding and reasoning, with 30 percent fewer tokens on average for everyday tasks. Became the default in the Gemini app and rolled into Search AI Mode at launch.

Nemotron 3 Nano detail → · Gemini 3 Flash detail →