Models · Compare
DeepSeek-V3 vs Gemini 2.0 Flash
Rows highlighted in warm gray are where the models differ. Numbers carry their as-of date and primary source.
Specs
| Field | A: DeepSeek-V3 | B: Gemini 2.0 Flash |
|---|---|---|
| Released | — | 2024-12-11 |
| Developer | DeepSeek | Google DeepMind |
| Openness | Open | Proprietary |
| License | DeepSeek License | Proprietary |
| OSI-approved | no | no |
| Data released | no | no |
| Training code | no | no |
| Architecture | moe | unknown |
| Total params | 671B | — |
| Active params | 37B | — |
| Experts | 256 (8 active) | — |
| Context window | 128K | 1.0M |
| Attention | mla | unknown |
| Position enc. | rope-yarn | unknown |
| Pretraining tokens | 14.8T | — |
| Post-training | sft, grpo | rlhf |
| Training hardware | H800 | — |
| $/M input | $0.40 | $0.15 |
| $/M output | $0.89 | $0.60 |
| Output tok/sec | 0 | 0 |
Benchmarks
Missing scores render as not reported; never inferred. Bold highlights the leader per benchmark.
General reasoning
| MMLU | 87.1 2024-12-27 | — |
| MMLU-Pro | 75.2 2026-05-21 | 77.9 2026-05-21 |
| GPQA-Diamond | 59.1 2024-12-27 | 62.3 2026-05-21 |
Code
| HumanEval | 65.2 2024-12-27 | — |
| LiveCodeBench | 35.9 2026-05-21 | 33.4 2026-05-21 |
Math
| MATH | 90.2 2024-12-27 | 93.0 2026-05-21 |
| AIME 2024 | 25.3 2026-05-21 | 33.0 2026-05-21 |
| AIME 2025 | 26.0 2026-05-21 | 21.7 2026-05-21 |
Context · A
The cost-quality reset. Pretrained for a reported $5.6M on H800 GPUs (US-export-constrained silicon) and matched closed-frontier benchmarks. Triggered the "DeepSeek moment" in January 2025 when US markets re-priced AI capex assumptions.
Context · B
Google's workhorse Gemini 2.0 checkpoint, released initially as an experimental developer preview. Google positioned it as outperforming Gemini 1.5 Pro on key benchmarks at roughly twice the speed, with native multimodal output (image generation, text-to-speech) and native tool use including Google Search and code execution. The Multimodal Live API added real-time audio and video streaming.