The Open-Source AI Stack


Kimi K2.5

Open · Moonshot AI · 2026-01-27 · Modified MIT

Natively multimodal Kimi model with 1T total and 32B active parameters, paired with MoonViT, a 400M-parameter vision encoder, and trained on roughly 15T mixed visual and text tokens. It has a 256K context window and was released on January 27, 2026 under a Modified MIT license. Compared with earlier K2 models, it adds visual reasoning, video understanding, and UI-to-code workflows.

Cost

$0.58 / Mtok input
$3.00 / Mtok output

As of 2026-05-21 · source ↗
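
The per-Mtok prices above translate into a per-request cost with simple arithmetic. The sketch below is illustrative only: the function name and the example token counts are assumptions, and it ignores any caching discounts or provider-specific billing rules, which this page does not describe.

```python
# Prices copied from the Cost section (USD per million tokens, as of 2026-05-21).
INPUT_USD_PER_MTOK = 0.58
OUTPUT_USD_PER_MTOK = 3.00


def request_cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Estimate the cost of one request at the listed per-Mtok prices."""
    total = input_tokens * INPUT_USD_PER_MTOK + output_tokens * OUTPUT_USD_PER_MTOK
    return total / 1_000_000


# Hypothetical workload: 20,000 prompt tokens and 1,500 completion tokens.
print(f"${request_cost_usd(20_000, 1_500):.4f}")  # $0.0161
```

At these prices an output token costs roughly five times as much as an input token, so long completions dominate the bill faster than long prompts.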

Speed

33.9 tok/sec output
1335 ms TTFT

As of 2026-05-21 · source ↗
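
The two speed figures can be combined into a rough end-to-end latency estimate: time to first token plus the time to stream the remaining output. This is a minimal sketch under two assumptions that the page does not confirm: that the decode rate stays constant over the whole response, and that TTFT does not grow with prompt length. The function name is hypothetical.

```python
# Figures copied from the Speed section (as of 2026-05-21).
TTFT_SECONDS = 1.335          # 1335 ms time to first token
OUTPUT_TOKENS_PER_SEC = 33.9  # steady-state output rate


def estimated_response_seconds(output_tokens: int) -> float:
    """Rough wall-clock time for a response: TTFT plus streaming time."""
    return TTFT_SECONDS + output_tokens / OUTPUT_TOKENS_PER_SEC


# Hypothetical 500-token completion.
print(f"{estimated_response_seconds(500):.1f} s")  # 16.1 s
```

For anything longer than a few dozen tokens the streaming term dominates, so the output rate matters more than TTFT for long completions.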

Architecture

[Architecture diagram] tokens in → Embedding (vocab 160,000 · kimi tokenizer) → × 61 layers: Multi-head Latent Attention (RoPE · context 256,000 tokens), MoE Router (384 experts total · 8 active per token · shown: 32 of 384) → Output projection → tokens out
Schema-generated from data/models.yaml. Every label is auditable against the model's sources.
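
To make "384 experts total · 8 active per token" concrete, here is a generic top-k routing sketch. It is not Moonshot AI's implementation: the page does not state the gating function, so the softmax over the selected logits, the random stand-in logits, and the name `route_token` are all assumptions for illustration.

```python
import math
import random

NUM_EXPERTS = 384  # experts total, from the page
TOP_K = 8          # experts active per token, from the page


def route_token(router_logits: list[float], k: int = TOP_K) -> list[tuple[int, float]]:
    """Return (expert_index, weight) pairs for the k highest-scoring experts.

    Weights are a softmax over the selected logits only, so they sum to 1.
    This gating choice is an assumption, not a documented detail of the model.
    """
    top = sorted(range(len(router_logits)),
                 key=lambda i: router_logits[i], reverse=True)[:k]
    peak = max(router_logits[i] for i in top)  # subtract the max for numerical stability
    exps = [math.exp(router_logits[i] - peak) for i in top]
    total = sum(exps)
    return [(i, e / total) for i, e in zip(top, exps)]


rng = random.Random(0)
logits = [rng.gauss(0.0, 1.0) for _ in range(NUM_EXPERTS)]  # stand-in for a learned router
chosen = route_token(logits)
print(len(chosen), round(sum(w for _, w in chosen), 6))
```

Only the selected experts run their feed-forward computation for that token, which is how a model with 1T total parameters can activate far fewer per token.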

Specs

Architecture: MoE
Total params: 1T
Active params: 32B
Experts: 384 total · 8 active
Context window: 256K tokens
Attention: MLA
Position encoding: RoPE
Pretraining tokens: 15.0T
Post-training: SFT, RLHF
OSI-approved: no
Data released: no
Training code: not released
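
The specs imply two sparsity ratios that are easy to check by hand. The snippet below treats "1T" and "32B" as exact, which they are not (both are rounded figures), so the results are approximate; the variable names are illustrative.

```python
# Values copied from the Specs list; "1T" and "32B" are rounded.
TOTAL_PARAMS = 1e12
ACTIVE_PARAMS = 32e9
EXPERTS_TOTAL = 384
EXPERTS_ACTIVE = 8

param_fraction = ACTIVE_PARAMS / TOTAL_PARAMS     # share of weights used per token
expert_fraction = EXPERTS_ACTIVE / EXPERTS_TOTAL  # share of experts used per token

print(f"{param_fraction:.1%} of parameters active per token")  # 3.2%
print(f"{expert_fraction:.2%} of experts active per token")    # 2.08%
```

The parameter share comes out higher than the expert share, plausibly because attention, embeddings, and other non-expert weights run for every token regardless of routing.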

Benchmarks

Each score carries the date it was published; we never infer or interpolate missing scores.

General reasoning

GPQA-Diamond · 87.6 · as of 2026-01-27 · source ↗

Code

SWE-Bench Verified · 76.8 · as of 2026-01-27 · source ↗
LiveCodeBench · 85.0 · as of 2026-01-27 · source ↗

Math

AIME 2025 · 96.1 · as of 2026-01-27 · source ↗
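
One way to honor the "never infer or interpolate" rule in code is to store each score with its publication date and return nothing at all for a benchmark that has no published score. This is a hypothetical sketch, not the site's actual schema; the `Score` class and `lookup` function are invented for illustration.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass(frozen=True)
class Score:
    benchmark: str
    value: float
    as_of: date  # date the score was published


# Scores copied from the Benchmarks section of this page.
SCORES = [
    Score("GPQA-Diamond", 87.6, date(2026, 1, 27)),
    Score("SWE-Bench Verified", 76.8, date(2026, 1, 27)),
    Score("LiveCodeBench", 85.0, date(2026, 1, 27)),
    Score("AIME 2025", 96.1, date(2026, 1, 27)),
]


def lookup(benchmark: str) -> Optional[Score]:
    """Return the published score, or None; missing scores are never estimated."""
    return next((s for s in SCORES if s.benchmark == benchmark), None)


print(lookup("AIME 2025"))
print(lookup("MMLU-Pro"))  # not listed on this page, so the result is None
```

Returning `None` instead of a default such as 0.0 keeps a missing score from being mistaken for a measured one.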

Available quantizations

GGUF: llama.cpp's container format and the common choice for local inference, with k-quants from Q2 to Q8. Runs on llama.cpp, Ollama.
MLX: Apple MLX 4-bit and 8-bit layouts for Apple silicon. Runs on Apple MLX.
FP8: 8-bit float, frequently a native release on Hopper / Blackwell GPUs. Runs on vLLM, SGLang, TensorRT-LLM.

Verified via the Hugging Face model tree ↗. Community quantizations change over time; the families shown are those with published weights at audit time.
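
A back-of-the-envelope estimate of weight storage at different bit widths helps when choosing among these formats. The sketch assumes a flat number of bits per weight and treats "1T" as exact; real quantized files differ (k-quants, for example, store per-block scales, so their effective bits per weight are somewhat higher than the nominal figure), and the estimate excludes KV cache and activations. The function name is illustrative.

```python
TOTAL_PARAMS = 1e12  # "1T" as listed on this page; a rounded figure


def weight_gigabytes(bits_per_weight: float, params: float = TOTAL_PARAMS) -> float:
    """Approximate weight storage in decimal GB (1 GB = 1e9 bytes)."""
    return params * bits_per_weight / 8 / 1e9


# Nominal bit widths only; actual file sizes will differ.
for label, bits in [("8-bit (FP8)", 8), ("4-bit", 4), ("2-bit", 2)]:
    print(f"{label:>12}: ~{weight_gigabytes(bits):,.0f} GB")
```

Even at a nominal 4 bits per weight the weights alone come to roughly 500 GB, so single-device local inference is out of reach for most consumer hardware.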

Notable innovations

  • Native multimodal Kimi (image, video, text)
  • 400M-parameter MoonViT vision encoder
  • Continual pretraining on ~15T mixed-modality tokens

Lineage

First multimodal Kimi in the K2 family. Produced by continual pretraining on ~15T mixed-modality tokens, it adds the MoonViT vision encoder and video understanding.
