The Open-Source AI Stack

Apple On-Device Foundation Model (2025) vs DeepSeek-R1 (May 2025 refresh)

Rows highlighted in warm gray are where the models differ. Numbers carry their as-of date and primary source.

Specs

Field | A: Apple On-Device Foundation Model (2025) | B: DeepSeek-R1 (May 2025 refresh)
Released | 2025-06-09 | 2025-05-28
Developer | Apple | DeepSeek
Openness | Proprietary | Open
License | Proprietary | MIT
OSI-approved | no | yes
Data released | no | no
Training code | no | no
Architecture | dense | moe
Total params | 3B | -
Active params | - | -
Experts | - | -
Context window | - | -
Attention | unknown | mla
Position enc. | unknown | rope-yarn
Pretraining tokens | - | -
Post-training | sft, rlhf | sft, grpo, rejection-sampling
Training hardware | - | H800
$/M input | - | -
$/M output | - | -
Output tok/sec | - | -

Benchmarks

Missing scores render as "not reported" and are never inferred. Bold highlights the leader per benchmark.

General reasoning

MMLU-Pro | A: not reported | B: 85.0 (as of 2025-05-28)
GPQA-Diamond | A: not reported | B: 81.0 (as of 2025-05-28)

Code

LiveCodeBench | A: not reported | B: 73.3 (as of 2025-05-28)

Math

AIME 2024 | A: not reported | B: 91.4 (as of 2025-05-28)
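
The rule stated at the top of this section (a missing score is shown as not reported and never inferred) can be sketched in a few lines of Python. This is only an illustration, not the site's actual code: the function names `render` and `leader` are hypothetical, and the choice to declare no leader when only one model has a score is an assumption, since the page does not say how that case is handled.

```python
from typing import Optional


def render(score: Optional[float]) -> str:
    """Format a benchmark cell; a missing score is never replaced by a guess."""
    return "not reported" if score is None else f"{score:.1f}"


def leader(a: Optional[float], b: Optional[float]) -> Optional[str]:
    """Return "A" or "B" for the higher score, or None if no leader can be named.

    Assumption: a leader is only named when both scores are present and differ.
    """
    if a is None or b is None or a == b:
        return None
    return "A" if a > b else "B"


# MMLU-Pro row from the table above: A has no reported score, B scores 85.0.
print(render(None), "|", render(85.0), "| leader:", leader(None, 85.0))
```

Under this assumption, none of the four benchmark rows above would name a leader, because model A has no reported score on any of them.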

Context · A

Apple Intelligence's on-device foundation model, announced at WWDC 2025 on June 9 and shipped in iOS 26. It has about 3B parameters and uses KV-cache sharing across blocks (a 37.5 percent reduction in KV-cache size) and 2-bit quantization-aware training. It is paired with a server-side Parallel-Track Mixture-of-Experts model running on Private Cloud Compute. The Foundation Models framework opened direct model access to developers.
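
The two efficiency figures above lend themselves to a quick back-of-the-envelope check. The sketch below is illustrative only. The 3-of-8 layer split is a hypothetical example that happens to reproduce 37.5 percent and is not taken from this page. The weight-size estimate treats "about 3B" as exactly 3e9 parameters and ignores embeddings, activations, and any tensors kept at higher precision.

```python
def kv_cache_reduction(total_layers: int, sharing_layers: int) -> float:
    """Fraction of KV cache saved when `sharing_layers` layers reuse
    another block's keys and values instead of storing their own."""
    return sharing_layers / total_layers


def weight_bytes(params: float, bits_per_weight: float) -> float:
    """Approximate storage for the weights alone, in bytes."""
    return params * bits_per_weight / 8


# Hypothetical split: 3 of every 8 layers share the cache of the other 5.
print(kv_cache_reduction(total_layers=8, sharing_layers=3))  # 0.375

# Rough weight footprint of ~3B parameters at 2 bits vs. 16 bits per weight.
print(weight_bytes(3e9, 2) / 1e9)   # 0.75 (GB)
print(weight_bytes(3e9, 16) / 1e9)  # 6.0 (GB)
```

On these assumptions, 2-bit weights take roughly one eighth of the space of 16-bit weights, which is the kind of saving that makes a model of this size practical on a phone.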

Context · B

An RL-only refresh of R1 that gained substantial ground on reasoning benchmarks (notably AIME 2024, where it scores 91.4 as of 2025-05-28) without any new pretraining. It narrowed the gap between open and closed reasoning models.
