Models · Compare

Gemini 3 Flash vs Nemotron 3 Nano

Rows highlighted in warm gray are where the models differ. Numbers carry their as-of date and primary source.

Specs

Field	A: Gemini 3 Flash	B: Nemotron 3 Nano
Released	2025-12-17	2025-12-15
Developer	Google DeepMind	NVIDIA
Openness	Proprietary	Open weights
License	Proprietary	NVIDIA Open Model License
OSI-approved	no	no
Data released	no	no
Training code	no	no
Architecture	unknown	hybrid-mamba-transformer-moe
Total params	—	30B
Active params	—	—
Experts	—	—
Context window	1.0M	1.0M
Attention	unknown	hybrid-mamba2-gqa
Position enc.	unknown	unknown
Pretraining tokens	—	—
Post-training	rlhf	sft, rlhf
Training hardware	—	—
$/M input	$0.50	—
$/M output	$3.00	—
Output tok/sec	185.9	—

Benchmarks

Missing scores render as not reported; never inferred. Bold highlights the leader per benchmark.

General reasoning

MMLU-Pro	88.2 2026-05-21	—
GPQA-Diamond	90.4 2025-12-17	—

Code

SWE-Bench Verified	78.0 2025-12-17	—
LiveCodeBench	79.7 2026-05-21	—

Math

AIME 2025

55.7 2026-05-21

—

Context · A

Speed tier of the Gemini 3 family, released December 17 2025. Google reported 3x the speed of Gemini 2.5 Pro while outperforming it on coding and reasoning, with 30 percent fewer tokens on average for everyday tasks. Became the default in the Gemini app and rolled into Search AI Mode at launch.

Context · B

First Nemotron 3 family member, a 31.6B-total / 3.2B-active hybrid Mamba-Transformer MoE. 52 layers (23 MoE, 23 Mamba-2, 6 GQA), 128 routed plus 1 shared expert with 6 activated per token. NVIDIA reports 3.3x Qwen3-30B-A3B throughput on a single H200 at 8K input / 16K output, with 1M-context support. Super and Ultra variants planned for H1 2026.

Gemini 3 Flash detail → · Nemotron 3 Nano detail →