The Open-Source AI Stack

Models · gemini-3

Gemini 3 Flash

Proprietary Google DeepMind · 2025-12-17 · Proprietary

Speed tier of the Gemini 3 family, released December 17 2025. Google reported 3x the speed of Gemini 2.5 Pro while outperforming it on coding and reasoning, with 30 percent fewer tokens on average for everyday tasks. Became the default in the Gemini app and rolled into Search AI Mode at launch.

Cost

$0.50 / Mtok input

$3.00 / Mtok output

Google Gemini API · as of 2026-05-21

Speed

185.9 tok/sec output

1022 ms TTFT

· as of 2026-05-21

Architecture

Schema-generated from data/models.yaml. Every label is auditable against the model's sources.

Specs

Architecture: unknown
Total params: not disclosed
Active params: not disclosed
Context window: 1.0M tokens
Attention: unknown
Position encoding: unknown
Post-training: rlhf
OSI-approved: no
Data released: no
Training code: not released

Benchmarks

Each score carries the date it was published; we never infer or interpolate missing scores.

General reasoning

MMLU-Pro	88.2	as of 2026-05-21	source ↗
GPQA-Diamond	90.4	as of 2025-12-17	source ↗

Code

SWE-Bench Verified	78.0	as of 2025-12-17	source ↗
LiveCodeBench	79.7	as of 2026-05-21	source ↗

Math

AIME 2025

55.7

as of 2026-05-21

Available quantizations

None. The weights are not distributed, so there are no public quantizations.

Notable innovations

· Pro-grade reasoning at Flash latency
· 30 percent token reduction on average tasks
· Day-one default in Gemini app and Search AI Mode

Lineage

Speed-optimized Gemini 3 sibling at one-tenth Pro pricing.

Derived from

Gemini 3 Pro 2025-11-18

Sources