Apple On-Device Foundation Model (2025) vs DeepSeek-R1 (May 2025 refresh)

Rows highlighted in warm gray are where the models differ. Numbers carry their as-of date and primary source.

Specs

Field	A: Apple On-Device Foundation Model (2025)	B: DeepSeek-R1 (May 2025 refresh)
Released	2025-06-09	2025-05-28
Developer	Apple	DeepSeek
Openness	Proprietary	Open
License	Proprietary	MIT
OSI-approved	no	yes
Data released	no	no
Training code	no	no
Architecture	dense	moe
Total params	3B	—
Active params	—	—
Experts	—	—
Context window	—	—
Attention	unknown	mla
Position enc.	unknown	rope-yarn
Pretraining tokens	—	—
Post-training	sft, rlhf	sft, grpo, rejection-sampling
Training hardware	—	H800
$/M input	—	—
$/M output	—	—
Output tok/sec	—	—

Benchmarks

Missing scores render as not reported; never inferred. Bold highlights the leader per benchmark.

General reasoning

MMLU-Pro	—	85.0 2025-05-28
GPQA-Diamond	—	81.0 2025-05-28

Code

LiveCodeBench	—	73.3 2025-05-28

Math

AIME 2024	—	91.4 2025-05-28

Context · A

Apple Intelligence's on-device foundation model, announced at WWDC 2025 on June 9 and shipped in iOS 26. It has about 3B parameters, uses KV-cache sharing across blocks (a 37.5 percent reduction in KV-cache memory) and 2-bit quantization-aware training, and is paired with a server-side Parallel-Track Mixture-of-Experts model running on Private Cloud Compute. The Foundation Models framework opened direct model access to developers.
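
The 37.5 percent figure and the 2-bit weight format can be sanity-checked with simple arithmetic. The sketch below assumes the layers are split into two blocks in a 5:3 depth ratio, with every layer in the second block reusing a KV cache produced by the first; that split is an assumption not stated in this summary, chosen because it reproduces the quoted number. The function names are hypothetical, and the weight estimate is a rough lower bound that ignores embeddings, scales, and any higher-precision tensors.

```python
from fractions import Fraction


def kv_cache_saving(block1_layers: int, block2_layers: int) -> Fraction:
    """Fraction of per-layer KV caches that no longer need to be stored
    when every layer in block 2 reuses a cache produced by block 1.

    Assumption: all layers would otherwise hold equally sized KV caches.
    """
    total_layers = block1_layers + block2_layers
    return Fraction(block2_layers, total_layers)


def weight_footprint_gb(params: float, bits_per_weight: float) -> float:
    """Rough weight storage in gigabytes (10^9 bytes) at a uniform bit width."""
    return params * bits_per_weight / 8 / 1e9


# Assumed 5:3 depth ratio between the two blocks (not stated in this summary).
saving = kv_cache_saving(5, 3)
print(f"KV cache saving: {float(saving):.1%}")

# About 3B parameters at 2 bits per weight, per the figures quoted above.
print(f"Approx. weight footprint: {weight_footprint_gb(3e9, 2):.2f} GB")
```

Under these assumptions, 3 of every 8 layers stop storing their own cache, which matches the 37.5 percent reduction, and about 3B weights at 2 bits each come to roughly 0.75 GB before any overhead.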

Context · B

An RL-only refresh of R1 that gained substantial ground on reasoning benchmarks (notably AIME 2024) without any new pretraining. It narrowed the gap between open and closed reasoning models.
