The Open-Source AI Stack

Gemini 2.5 Pro

Google DeepMind · 2025-03-25 · Proprietary

The first Gemini release to clearly lead on LMArena Elo and on hard reasoning benchmarks. Native 1M-token context, with an expansion to 2M tokens reportedly in the pipeline.

Cost

$1.25 / Mtok input
$10.00 / Mtok output

Google API · as of 2026-05-21

via Artificial Analysis ↗

Speed

127.3 tok/sec output
22,248 ms TTFT (time to first token)

Google API · as of 2026-05-21

via Artificial Analysis ↗
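
The two speed figures above can be combined into a rough end-to-end latency estimate. The sketch below is illustrative only: the function name `estimate_latency_seconds` is hypothetical, and it assumes tokens stream at a constant rate after the first token arrives, which real responses only approximate.

```python
# Figures as listed on this page (Google API, as of 2026-05-21).
TTFT_MS = 22_248             # time to first token, in milliseconds
OUTPUT_TOK_PER_SEC = 127.3   # output throughput once streaming starts


def estimate_latency_seconds(
    output_tokens: int,
    ttft_ms: float = TTFT_MS,
    tok_per_sec: float = OUTPUT_TOK_PER_SEC,
) -> float:
    """Rough wall-clock time for one response.

    Assumption: latency = TTFT + output_tokens / throughput, with a
    constant streaming rate. Queueing and network variance are ignored.
    """
    if output_tokens < 0:
        raise ValueError("output_tokens must be non-negative")
    if tok_per_sec <= 0:
        raise ValueError("tok_per_sec must be positive")
    return ttft_ms / 1000.0 + output_tokens / tok_per_sec


if __name__ == "__main__":
    for n in (100, 1_000, 10_000):
        print(f"{n:>6} output tokens: ~{estimate_latency_seconds(n):.1f} s")
```

Under these assumptions the time to first token dominates short responses, so the throughput figure matters mainly for long outputs.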

Why people cared

Gemini 2.5 Pro was the first Gemini release to clearly lead on LMArena Elo, GPQA-Diamond, and AIME 2024 at the same time. Released in March 2025, it combined Google's investment in long-context modeling (the 1M-token native window had been a Gemini differentiator since 1.5 Pro in February 2024) with the reasoning-model approach that o1 and R1 had established.

Pricing was tiered: input at $1.25/M tokens for prompts up to 200K tokens and $2.50/M above that, with output at $10/M, extended thinking included. That made it cheaper than Claude 3.7 Sonnet at the headline input rate but more expensive at high output volumes, which split the market between the two for production workloads. Google also published the Gemini 2.5 family across Pro, Flash, and Flash-Lite, with the smaller variants positioned aggressively against open-weights MoE models on cost.

Gemini's lasting differentiator through 2025 remained its multimodal handling (vision, video, and audio integrated more deeply than in competing closed frontier models) and its native long-context performance: benchmarks measuring needle-in-a-haystack retrieval at 1M tokens consistently favored Gemini 2.5 Pro over peers.
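
To make the tiered pricing concrete, here is a minimal per-request cost sketch using the rates quoted above. The function name `estimate_cost_usd` is hypothetical, and two assumptions are baked in: the higher input rate applies to the whole prompt once it exceeds 200K tokens, and the output rate stays flat at $10/M regardless of prompt length. Check the provider's current price sheet before relying on either.

```python
# Rates as quoted on this page, in USD per million tokens.
INPUT_RATE_SHORT = 1.25    # prompts up to 200K tokens
INPUT_RATE_LONG = 2.50     # prompts above 200K tokens
OUTPUT_RATE = 10.00        # output, extended thinking included
TIER_THRESHOLD = 200_000   # prompt size at which the input rate changes


def estimate_cost_usd(prompt_tokens: int, output_tokens: int) -> float:
    """Estimate the cost of a single request in USD.

    Assumptions (not confirmed by this page): the long-prompt rate is
    charged on the entire prompt, and the output rate does not change
    with prompt length.
    """
    if prompt_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    if prompt_tokens <= TIER_THRESHOLD:
        input_rate = INPUT_RATE_SHORT
    else:
        input_rate = INPUT_RATE_LONG
    total = prompt_tokens * input_rate + output_tokens * OUTPUT_RATE
    return total / 1_000_000


if __name__ == "__main__":
    # A 100K-token prompt stays in the lower tier; a 500K-token prompt does not.
    print(f"${estimate_cost_usd(100_000, 2_000):.3f}")
    print(f"${estimate_cost_usd(500_000, 2_000):.3f}")
```

Under these assumptions, crossing the 200K threshold doubles the input portion of the bill, which is why long-context workloads can cost well above the headline rate.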

Architecture

[Diagram: tokens in → Embedding (vocab not disclosed) → × N layers (architecture not disclosed; proprietary or undocumented) → Output projection → tokens out]
Schema-generated from data/models.yaml. Every label is auditable against the model's sources.

Specs

Architecture: unknown
Total params: not disclosed
Active params: not disclosed
Context window: 1.0M tokens
Attention: unknown
Position encoding: unknown
Post-training: RLHF
OSI-approved: no
Data released: no
Training code: not released

Benchmarks

Each score carries the date it was published; we never infer or interpolate missing scores.

General reasoning

MMLU-Pro 86.2 as of 2026-05-21 source ↗

Code

LiveCodeBench 80.1 as of 2026-05-21 source ↗

Math

MATH 96.7 as of 2026-05-21 source ↗
AIME 2025 87.7 as of 2026-05-21 source ↗

Recommended use cases

  • long-context tasks
  • multimodal reasoning
  • general chat

Available quantizations

None. The weights are not distributed, so there are no public quantizations.

Notable innovations

  • 1M-token native context
  • LMArena Elo leadership at release

Known limitations

  • The input pricing tier changes above 200K tokens; long-context use cases can cost 2x the headline rate. source ↗

Lineage

Continues the Gemini Pro line with native 1M-token context.

Sources