Cost
Google API · as of 2026-05-21
Why people cared
Gemini 2.5 Pro was the first Gemini release to clearly lead on LMArena Elo, GPQA-Diamond, and AIME 2024 at the same time. Released in March 2025, it combined Google's investments in long-context modeling (the 1M-token native window had been a Gemini differentiator since 1.5 Pro the previous February) with the reasoning-model wave that o1 and R1 had established.
The pricing structure (input at $1.25/M tokens for prompts up to 200K tokens, $2.50/M above; output at $10/M with extended thinking included) made it cheaper than Claude 3.7 Sonnet at the headline input rate but more expensive at high output volumes, which split the market between the two for production workloads. Google also published the Gemini 2.5 family across Pro, Flash, and Flash-Lite, with the smaller variants positioned aggressively against open-weights MoE models on cost.
Gemini's lasting differentiators through 2025 remained its multimodal handling (vision, video, and audio integrated more deeply than in competing closed frontier models) and its native long-context performance, where benchmarks measuring needle-in-a-haystack retrieval at 1M tokens consistently favored Gemini 2.5 Pro over peers.
Architecture
data/models.yaml. Every label is auditable against the model's sources.
Specs
- Architecture: unknown
- Total params: not disclosed
- Active params: not disclosed
- Context window: 1.0M tokens
- Attention: unknown
- Position encoding: unknown
- Post-training: RLHF
- OSI-approved: no
- Data released: no
- Training code: not released
Benchmarks
Each score carries the date it was published; we never infer or interpolate missing scores.
General reasoning
| Benchmark | Score | Date | Source |
| --- | --- | --- | --- |
| MMLU-Pro | 86.2 | as of 2026-05-21 | source ↗ |
Code
| Benchmark | Score | Date | Source |
| --- | --- | --- | --- |
| LiveCodeBench | 80.1 | as of 2026-05-21 | source ↗ |
Recommended use cases
- long-context tasks
- multimodal reasoning
- general chat
Available quantizations
None. The weights are not distributed, so there are no public quantizations.
Notable innovations
- 1M-token native context
- LMArena Elo leadership at release
Known limitations
- Input pricing tier changes above 200K tokens; long-context use cases can cost 2x the headline rate. source ↗
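To make the tier change concrete, here is a minimal sketch of a per-request cost estimate. It uses only the rates quoted on this page ($1.25/M input up to 200K prompt tokens, $2.50/M above, $10/M output). The function name `estimate_cost_usd` is hypothetical, and two points are assumptions rather than confirmed billing rules: that the higher input rate applies to the whole prompt once it exceeds 200K tokens, and that the output rate stays flat regardless of prompt size.

```python
# Rates quoted on this page, in USD per million tokens.
INPUT_RATE_STANDARD = 1.25   # prompts up to 200K tokens
INPUT_RATE_LONG = 2.50       # prompts above 200K tokens
OUTPUT_RATE = 10.00          # assumed flat here; thinking tokens count as output
LONG_PROMPT_THRESHOLD = 200_000


def estimate_cost_usd(prompt_tokens: int, output_tokens: int) -> float:
    """Hypothetical helper: estimate the cost of one request in USD."""
    if prompt_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    # Assumption: the tier is picked by total prompt size and
    # applies to every token in the prompt, not just the excess.
    if prompt_tokens > LONG_PROMPT_THRESHOLD:
        input_rate = INPUT_RATE_LONG
    else:
        input_rate = INPUT_RATE_STANDARD
    total = prompt_tokens * input_rate + output_tokens * OUTPUT_RATE
    return total / 1_000_000


# A 100K-token prompt versus a 400K-token prompt, each with 2K output tokens.
short_request = estimate_cost_usd(100_000, 2_000)
long_request = estimate_cost_usd(400_000, 2_000)
print(f"100K prompt: ${short_request:.3f}")
print(f"400K prompt: ${long_request:.3f}")
```

Under these assumptions the 400K-token prompt costs about seven times as much as the 100K-token one, even though it is only four times longer, because the input rate doubles once the prompt crosses the threshold.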
Lineage
Continues the Gemini Pro line with native 1M-token context.