StarCoder 2 15B

Architecture

Schema-generated from data/models.yaml. Every label is auditable against the model's sources.

Specs

Architecture: dense
Total params: 15B
Active params: 15B
Context window: 16K tokens
Attention: hybrid-gqa-sliding
Position encoding: rope
Pretraining tokens: 4.0T
Training hardware: H100
OSI-approved: no
Data released: yes
Training code: released
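
The attention label combines grouped-query attention with a sliding window. As a rough illustration of the sliding-window part only, the sketch below builds a causal mask in which each position sees itself and a fixed number of preceding positions. The function name is hypothetical, and the sequence length of 6 and window of 3 are toy values chosen for readability, not the model's actual configuration.

```python
def sliding_window_causal_mask(seq_len: int, window: int) -> list[list[bool]]:
    """Return mask[i][j] == True when query position i may attend to key position j."""
    return [
        # Causal: j <= i. Sliding: j is at most window - 1 positions behind i.
        [0 <= i - j < window for j in range(seq_len)]
        for i in range(seq_len)
    ]


# Toy sizes for illustration only (assumption, not the model's real window).
mask = sliding_window_causal_mask(seq_len=6, window=3)
for row in mask:
    print("".join("x" if allowed else "." for allowed in row))
```

Each printed row is one query position; once the sequence is longer than the window, the oldest positions drop out of view, which is what bounds attention cost per token.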

Benchmarks

Each score carries the date it was published; we never infer or interpolate missing scores.

Code

HumanEval: 46.3 (as of 2024-02-29) source ↗
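
HumanEval results are conventionally reported as pass@k, the estimated chance that at least one of k sampled completions passes a problem's unit tests. This page does not state which k or sampling setup produced the score above, so the sketch below uses k=1 as an assumption. The per-problem counts are hypothetical placeholders, not benchmark data.

```python
from math import comb


def pass_at_k(n: int, c: int, k: int) -> float:
    """Unbiased pass@k estimate for one problem: n samples drawn, c of them pass."""
    if n - c < k:
        return 1.0  # every size-k subset contains at least one passing sample
    return 1.0 - comb(n - c, k) / comb(n, k)


# Hypothetical (samples drawn, samples passing) per problem; NOT real results.
results = [(10, 5), (10, 0), (10, 10), (10, 2)]
score = 100 * sum(pass_at_k(n, c, k=1) for n, c in results) / len(results)
print(f"pass@1 = {score:.1f}")
```

With k=1 the estimator reduces to the fraction of passing samples per problem, averaged over problems and scaled to a percentage.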

Recommended use cases

· Code completion and fill-in-the-middle (FIM) infilling
· Research baseline, with the full training data released
· Starting point for fine-tuning a code chat model
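
For the fill-in-the-middle use case, a prompt is typically assembled from the code before and after the gap, separated by sentinel tokens, and the model generates the missing middle. The sketch below shows one common arrangement (prefix, then suffix, then a marker where generation starts). The sentinel strings and the helper name are assumptions made for illustration; confirm the actual special tokens against the model's tokenizer before use.

```python
# Sentinel strings are assumptions; verify them in the tokenizer's special tokens.
FIM_PREFIX = "<fim_prefix>"
FIM_SUFFIX = "<fim_suffix>"
FIM_MIDDLE = "<fim_middle>"


def build_fim_prompt(prefix: str, suffix: str) -> str:
    """Arrange the code around the gap in prefix-suffix-middle order."""
    return f"{FIM_PREFIX}{prefix}{FIM_SUFFIX}{suffix}{FIM_MIDDLE}"


prompt = build_fim_prompt(
    prefix="def mean(values):\n    total = ",
    suffix="\n    return total / len(values)\n",
)
print(prompt)
```

The model's completion after the final sentinel is the text to splice between the prefix and suffix.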

Available quantizations

GGUF: llama.cpp's container; the common local format, with k-quants from Q2 to Q8. Runs on: llama.cpp, Ollama

AWQ: Activation-aware 4-bit weight quantization for GPU serving. Runs on: vLLM, SGLang

MLX: Apple MLX 4/8-bit layout for Apple silicon. Runs on: Apple MLX

bitsandbytes: On-the-fly NF4 / INT8 weight quantization inside Transformers. Runs on: Transformers

Verified via the Hugging Face model tree ↗. Community quantizations change over time; the families shown are those with published weights at audit time.
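
To put the bit widths above in context, the weights-only footprint can be estimated from the 15B parameter count in Specs. The sketch below is back-of-the-envelope arithmetic, not a measurement: real quantized files are somewhat larger because of scales, zero points, and layers kept at higher precision, and inference also needs memory for the KV cache and activations. The helper name is hypothetical.

```python
PARAMS = 15e9  # total parameters, from the Specs section


def weight_gigabytes(params: float, bits_per_weight: float) -> float:
    """Storage for the weights alone, in decimal gigabytes (1 GB = 1e9 bytes)."""
    return params * bits_per_weight / 8 / 1e9


for label, bits in [("16-bit", 16), ("8-bit", 8), ("4-bit", 4)]:
    print(f"{label}: {weight_gigabytes(PARAMS, bits):.1f} GB")
```

Under these assumptions the weights need roughly 30 GB at 16 bits, 15 GB at 8 bits, and 7.5 GB at 4 bits.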

Notable innovations

· Full data release via The Stack v2
· Pretraining attribution search index
· Fill-in-the-middle objective applied at 4T-token pretraining scale

Known limitations

· Base model is not instruction-tuned; designed for code completion rather than chat. source ↗
· BigCode OpenRAIL-M v1 carries use restrictions and is not OSI-approved. source ↗

Lineage

Successor to StarCoder and StarCoder Plus. The StarCoder 2 family also includes sibling models at 3B and 7B. The Stack v2 dataset supersedes The Stack v1.