The Open-Source AI Stack

Phi-4 Reasoning

Open · Microsoft · 2025-04-30 · MIT

A 14B-parameter, reasoning-tuned derivative of Phi-4, trained with supervised fine-tuning (SFT) only on curated reasoning traces and synthetic prompts. Training took 2.5 days on 32 H100-80G GPUs over 16B tokens. The separate Plus variant (Phi-4-reasoning-plus) adds an RL stage. Microsoft positioned the model as reaching DeepSeek R1 territory at a much smaller scale.
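
As a rough sanity check on the training figures above, the stated 2.5 days, 32 GPUs, and 16B tokens imply a GPU-hour budget and an average throughput. The sketch below is back-of-the-envelope arithmetic only: it assumes the full wall-clock time was spent training on all 32 GPUs, and the function and variable names are illustrative, not taken from any Microsoft source.

```python
# Figures as stated in the summary above.
TRAINING_DAYS = 2.5
NUM_GPUS = 32
TOKENS = 16e9


def training_budget(days: float, gpus: int, tokens: float):
    """Return (GPU-hours, aggregate tokens/s, per-GPU tokens/s).

    Assumes every GPU was busy for the whole wall-clock period,
    so the throughput values are averages, not measured rates.
    """
    wall_seconds = days * 24 * 3600
    gpu_hours = days * 24 * gpus
    aggregate_tps = tokens / wall_seconds
    per_gpu_tps = aggregate_tps / gpus
    return gpu_hours, aggregate_tps, per_gpu_tps


gpu_hours, aggregate_tps, per_gpu_tps = training_budget(
    TRAINING_DAYS, NUM_GPUS, TOKENS
)
print(f"GPU-hours:            {gpu_hours:,.0f}")
print(f"Aggregate tokens/sec: {aggregate_tps:,.0f}")
print(f"Per-GPU tokens/sec:   {per_gpu_tps:,.0f}")
```

Under these assumptions the run comes to 1,920 GPU-hours, or roughly 74,000 tokens per second across the cluster and about 2,300 tokens per second per GPU.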

Architecture

Tokens in
Embedding: vocab not disclosed
× N layers:
  Attention: not disclosed
  Position encoding: not disclosed; context 32,000 tokens
  Dense MLP: SwiGLU activation (standard); 14B active params
Output projection
Tokens out
Schema-generated from data/models.yaml. Every label is auditable against the model's sources.

Specs

Architecture: dense
Total params: 14B
Active params: 14B
Context window: 32K tokens
Attention: unknown
Position encoding: unknown
Post-training tokens: 16B
Training hardware: H100
Post-training: SFT (the Plus variant adds RL)
OSI-approved: yes
Data released: no
Training code: not released

Benchmarks

Each score carries the date it was published; we never infer or interpolate missing scores.

General reasoning

MMLU-Pro 74.3 as of 2025-04-30 source ↗
GPQA-Diamond 65.8 as of 2025-04-30 source ↗

Code

LiveCodeBench 53.8 as of 2025-04-30 source ↗

Math

AIME 2024 75.3 as of 2025-04-30 source ↗
AIME 2025 62.9 as of 2025-04-30 source ↗
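
The "never infer or interpolate" policy stated at the top of this section can be pictured as a lookup that returns a dated record or nothing at all. The sketch below is a hypothetical illustration, not the site's actual schema: the `Score` class and `lookup` function are invented names, and only the five scores listed above are encoded.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass(frozen=True)
class Score:
    """One published benchmark result with its publication date."""
    benchmark: str
    value: float
    as_of: date


# The scores listed in this section, all dated 2025-04-30.
PUBLISHED = date(2025, 4, 30)
SCORES = [
    Score("MMLU-Pro", 74.3, PUBLISHED),
    Score("GPQA-Diamond", 65.8, PUBLISHED),
    Score("LiveCodeBench", 53.8, PUBLISHED),
    Score("AIME 2024", 75.3, PUBLISHED),
    Score("AIME 2025", 62.9, PUBLISHED),
]


def lookup(benchmark: str, scores=SCORES) -> Optional[Score]:
    """Return the published score, or None if none is recorded.

    A missing benchmark is reported as missing; it is never
    estimated from other benchmarks or other models.
    """
    for score in scores:
        if score.benchmark == benchmark:
            return score
    return None


for name in ("AIME 2025", "HumanEval"):
    result = lookup(name)
    if result is None:
        print(f"{name}: no published score")
    else:
        print(f"{name}: {result.value} as of {result.as_of.isoformat()}")
```

Returning `None` instead of a default number keeps a missing score visibly missing, which matches the policy of showing only dated, published results.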

Available quantizations

GGUF: llama.cpp's container; the common local format, with k-quants from Q2 to Q8. Runs on llama.cpp, Ollama.
AWQ: Activation-aware 4-bit weight quantization for GPU serving. Runs on vLLM, SGLang.
MLX: Apple MLX 4/8-bit layout for Apple silicon. Runs on Apple MLX.
FP8: 8-bit float, frequently a native release on Hopper / Blackwell GPUs. Runs on vLLM, SGLang, TensorRT-LLM.
bitsandbytes: On-the-fly NF4 / INT8 weight quantization inside Transformers. Runs on Transformers.

Verified via the Hugging Face model tree ↗. Community quantizations change over time; the families shown are those with published weights at audit time.
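
To give a sense of what these formats mean for a 14B-parameter model, the sketch below estimates weight storage at a few nominal bit widths. It is a simplification under stated assumptions: it counts weights only (no KV cache or activations), ignores the scales and metadata that real quantized files carry, and treats each format as a uniform bit width, which GGUF k-quants in particular are not. The names are illustrative.

```python
# Parameter count from the Specs section.
PARAMS = 14e9

# Nominal bits per weight; real files add scales, zero points,
# and often keep some tensors at higher precision.
NOMINAL_BITS = {
    "16-bit (unquantized FP16/BF16)": 16,
    "8-bit (FP8, INT8)": 8,
    "4-bit (AWQ, NF4, MLX 4-bit)": 4,
}


def weight_gigabytes(params: float, bits_per_weight: int) -> float:
    """Approximate weight storage in GB (1 GB = 1e9 bytes)."""
    return params * bits_per_weight / 8 / 1e9


for label, bits in NOMINAL_BITS.items():
    print(f"{label}: ~{weight_gigabytes(PARAMS, bits):.0f} GB")
```

At these nominal widths the weights alone come to roughly 28 GB, 14 GB, and 7 GB respectively; actual file sizes and runtime memory will be somewhat higher.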

Notable innovations

  • SFT-only reasoning recipe at 14B
  • 16B-token reasoning post-training corpus
  • Phi-4-reasoning-plus variant adds RL stage

Lineage

SFT-only reasoning derivative of Phi-4 at 14B.

Derived from

Phi-4 2024-12-12

Sources