Gemini 1.5 Pro vs StarCoder 2 15B
Rows highlighted in warm gray are where the models differ. Numbers carry their as-of date and primary source.
Specs
| Field | A: Gemini 1.5 Pro | B: StarCoder 2 15B |
|---|---|---|
| Released | 2024-02-15 | 2024-02-28 |
| Developer | Google DeepMind | BigCode |
| Openness | Proprietary | Source-available |
| License | Proprietary | BigCode OpenRAIL-M v1 |
| OSI-approved | no | no |
| Data released | no | yes |
| Training code | no | yes |
| Architecture | moe | dense |
| Total params | — | 15B |
| Active params | — | — |
| Experts | — | — |
| Context window | 2.1M | 16K |
| Attention | unknown | hybrid-gqa-sliding |
| Position enc. | unknown | rope |
| Pretraining tokens | — | 4.0T |
| Post-training | rlhf | — |
| Training hardware | — | H100 |
| $/M input | — | — |
| $/M output | — | — |
| Output tok/sec | — | — |
Benchmarks
Missing scores render as not reported; never inferred. Bold highlights the leader per benchmark.
General reasoning
| Benchmark | A: Gemini 1.5 Pro | B: StarCoder 2 15B |
|---|---|---|
| MMLU-Pro | 75.0 (as of 2026-05-21) | — |
| GPQA-Diamond | 58.9 (as of 2026-05-21) | — |
Code
| Benchmark | A: Gemini 1.5 Pro | B: StarCoder 2 15B |
|---|---|---|
| HumanEval | — | 46.3 (as of 2024-02-29) |
| LiveCodeBench | 31.6 (as of 2026-05-21) | — |
Math
| Benchmark | A: Gemini 1.5 Pro | B: StarCoder 2 15B |
|---|---|---|
| MATH | 87.6 (as of 2026-05-21) | — |
| AIME 2024 | 23.0 (as of 2026-05-21) | — |
Context · A
Gemini 1.5 Pro was Google's first long-context Gemini checkpoint, introduced with a 128K standard window and a 1M-token preview tier. Google described the design as a mixture-of-experts that activates a subset of expert networks per input, and at launch it reported 99% recall on needle-in-a-haystack tests across 1M tokens. The context window was later extended to 2M tokens in private preview, announced on May 14, 2024.
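The idea of activating only a subset of experts per input can be illustrated with a generic top-k router. This is a minimal sketch of the general technique only: Google has not published Gemini 1.5 Pro's routing details, so the function name `top_k_route`, the logits, and the choice of `k` are all hypothetical.

```python
import math


def top_k_route(router_logits: list[float], k: int) -> dict[int, float]:
    """Pick the k highest-scoring experts and softmax-normalise their weights.

    Generic top-k gating for illustration; not Gemini's actual router.
    """
    if not 1 <= k <= len(router_logits):
        raise ValueError("k must be between 1 and the number of experts")
    # Indices of the k largest logits (ties keep their original order).
    chosen = sorted(
        range(len(router_logits)),
        key=lambda i: router_logits[i],
        reverse=True,
    )[:k]
    # Softmax over the selected experts only, shifted for numerical stability.
    peak = max(router_logits[i] for i in chosen)
    exps = {i: math.exp(router_logits[i] - peak) for i in chosen}
    total = sum(exps.values())
    return {i: e / total for i, e in exps.items()}


# Four hypothetical experts; only two are activated for this input.
weights = top_k_route([0.1, 2.0, -1.0, 2.0], k=2)
print(weights)
```

Only the selected experts run for a given input, which is why a mixture-of-experts model can carry many more total parameters than it uses per token.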
Context · B
BigCode's StarCoder 2 15B was trained on 4T+ tokens of The Stack v2, a publicly released code dataset that spans 600+ languages and includes only permissively licensed code. Combining sliding-window attention with grouped-query attention gave it a 16K context window at the 15B scale. The accompanying data, training code, and attribution search index were all released alongside the weights.
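The two attention techniques named above can be sketched in a few lines. This is an illustration of the general mechanisms, not StarCoder 2's code: the function names, the window of 3, and the head counts are hypothetical values chosen to keep the output small.

```python
def sliding_window_causal_mask(seq_len: int, window: int) -> list[list[bool]]:
    """Return mask[q][k] == True when query position q may attend to key k.

    A key is visible if it is not in the future (k <= q) and lies within
    the last `window` positions, counting the query position itself.
    """
    return [
        [(k <= q) and (q - k < window) for k in range(seq_len)]
        for q in range(seq_len)
    ]


def kv_head_for_query_head(
    query_head: int, num_query_heads: int, num_kv_heads: int
) -> int:
    """Grouped-query attention: map a query head to its shared key/value head."""
    if num_query_heads % num_kv_heads != 0:
        raise ValueError("num_query_heads must be a multiple of num_kv_heads")
    group_size = num_query_heads // num_kv_heads
    return query_head // group_size


# Toy sizes for illustration only.
mask = sliding_window_causal_mask(seq_len=6, window=3)
for row in mask:
    print("".join("x" if allowed else "." for allowed in row))
```

Each query sees at most `window` keys, so attention cost per token stays bounded as the sequence grows, while sharing key/value heads across groups of query heads shrinks the key/value cache.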