Architecture
data/models.yaml. Every label is auditable against the model's sources.
Specs
- Architecture: moe
- Total params: 236B
- Active params: 21B
- Experts: 160 total · 6 active
- Context window: 128K tokens
- Attention: mla
- Position encoding: rope-yarn
- Pretraining tokens: 6.0T
- Post-training: sft, rlhf
- OSI-approved: no
- Data released: no
- Training code: not released
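The parameter and expert counts above imply how sparse the model is per token. The following is a minimal sketch of that arithmetic; the `MoESpec` class and its field names are hypothetical, chosen for illustration, and are not taken from data/models.yaml.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MoESpec:
    """Hypothetical container for the MoE figures listed in the specs."""
    total_params_b: float   # total parameters, in billions
    active_params_b: float  # parameters active per token, in billions
    experts_total: int      # experts listed as total
    experts_active: int     # experts listed as active per token

    @property
    def active_param_fraction(self) -> float:
        # Share of all parameters that participate in one forward pass.
        return self.active_params_b / self.total_params_b

    @property
    def active_expert_fraction(self) -> float:
        # Share of the listed experts selected per token.
        return self.experts_active / self.experts_total


# Figures copied from the specs: 236B total, 21B active, 160 experts, 6 active.
spec = MoESpec(total_params_b=236.0, active_params_b=21.0,
               experts_total=160, experts_active=6)

print(f"active parameters: {spec.active_param_fraction:.1%}")  # 8.9%
print(f"active experts:    {spec.active_expert_fraction:.2%}")  # 3.75%
```

The parameter fraction is higher than the expert fraction, which is expected in general: components outside the routed experts, such as attention and embeddings, run for every token.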
Benchmarks
Each score carries the date it was published; we never infer or interpolate missing scores.
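One way to enforce this policy in a data model is to make a missing score an explicit empty value and to reject any score that lacks a publication date. The sketch below assumes a hypothetical `BenchmarkScore` record; the field names and the placeholder benchmark name are illustrative only, and no real scores are shown.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass(frozen=True)
class BenchmarkScore:
    """Hypothetical record: a score is either published with a date, or absent."""
    benchmark: str
    score: Optional[float] = None     # None means "no published score"
    published: Optional[date] = None  # required whenever score is set

    def __post_init__(self) -> None:
        # A score without a publication date cannot be audited, so refuse it.
        if self.score is not None and self.published is None:
            raise ValueError(f"{self.benchmark}: a score needs a publication date")

    def display(self) -> str:
        # Missing scores are shown as missing, never estimated or interpolated.
        if self.score is None:
            return f"{self.benchmark}: not published"
        return f"{self.benchmark}: {self.score} (published {self.published.isoformat()})"


# "placeholder-bench" is a made-up name used only to show the missing-score path.
print(BenchmarkScore("placeholder-bench").display())
```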
Recommended use cases
- repo-scale code generation
- multi-language code assistance
- code reasoning at MoE economics
Available quantizations
GGUF: llama.cpp's container format; the common local format, with k-quants from Q2 to Q8. Runs on llama.cpp and Ollama.
Verified via the Hugging Face model tree ↗. Community quantizations change over time; the families shown are those with published weights at audit time.
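For local use, a rough weight-size estimate helps when choosing a quantization level. The sketch below multiplies the 236B total parameter count by a nominal bits-per-weight value; the function name and the 2, 4, and 8 bit inputs are assumptions for illustration. Real k-quant files mix precisions across tensors, so actual file sizes will differ from these figures, and runtime memory also includes the KV cache and activations.

```python
def weight_size_gb(total_params: float, bits_per_weight: float) -> float:
    """Approximate weight storage in decimal gigabytes (1 GB = 1e9 bytes)."""
    if total_params <= 0 or bits_per_weight <= 0:
        raise ValueError("total_params and bits_per_weight must be positive")
    return total_params * bits_per_weight / 8 / 1e9


# An MoE model needs all experts resident, so the estimate uses the total
# parameter count (236B), not the 21B active per token.
TOTAL_PARAMS = 236e9

for bits in (2, 4, 8):  # nominal values, not measured k-quant averages
    print(f"{bits}-bit: ~{weight_size_gb(TOTAL_PARAMS, bits):.0f} GB")
```

Under these assumptions the estimates come to roughly 59 GB, 118 GB, and 236 GB.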
Notable innovations
- MoE code model at GPT-4 Turbo HumanEval parity
- 338 programming languages
- Context extended to 128K for repo-scale tasks
Lineage
Coder branch of the V2 line, derived from DeepSeek-V2 base (not from the V2-Chat sibling); later folded into V2.5.