The Open-Source AI Stack


Apple Mac Studio (M3 Ultra)

Apple

The capacity champion among single-box consumer hardware: up to 512 GB of unified memory usable as VRAM, at a bandwidth approaching that of a workstation GPU. Apple quotes memory bandwidth as over 800 GB/s; the spec-sheet figure is 819 GB/s.

[Diagram: CPU · GPU · NPU connected by an 819 GB/s memory bus to 512 GB of unified memory]
Apple unified. In the diagram, the width of the bus scales with bandwidth (819 GB/s, which sets decode speed) and the size of the box scales with capacity (512 GB, which sets what fits). Unified memory is a single pool shared by the CPU and the accelerators.

Specs

Memory: 512 GB unified LPDDR5X
Bandwidth: 819 GB/s
Power: 270 W
Form factor: SoC
Interconnect: none
Released: 2025-03

What it runs (single unit, Q4_K_M, 4K context)

Model                     Fits?   Decode ceiling
Llama 3.1 8B Instruct     yes     ~163 tok/s
Llama 3.3 70B Instruct    yes     ~20 tok/s
Qwen 2.5 72B Instruct     yes     ~20 tok/s
DeepSeek-V3               yes     ~39 tok/s

Ceiling is the theoretical roofline: a performance model that bounds throughput by either compute or memory bandwidth, whichever is the limiting resource for an operation's arithmetic intensity. Open the explorer to set quant, context, and runtime and see the realistic range.
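For single-stream decode, the roofline is usually memory-bound: each generated token needs roughly one pass over the weights that are active for that token, so the ceiling is about bandwidth divided by bytes read per token. The sketch below illustrates that arithmetic together with a simple capacity check. It is a rough model under stated assumptions, not the page's actual calculator: the `Model` class, the function names, and the 4.85 bits-per-weight figure for Q4_K_M are illustrative, and KV-cache traffic at 4K context is ignored, so the results will not necessarily match the table.

```python
from dataclasses import dataclass

GB = 1e9  # decimal gigabytes, matching how bandwidth is quoted

# ASSUMPTION: average bits per weight for Q4_K_M; the real figure varies by model.
Q4_K_M_BITS = 4.85


@dataclass(frozen=True)
class Model:
    """Hypothetical container for the two parameter counts that matter here."""
    name: str
    total_params: float   # parameters that must sit in memory
    active_params: float  # parameters read per decoded token (smaller for MoE)


def weight_bytes(params: float, bits_per_weight: float) -> float:
    """Bytes occupied by `params` weights at the given quantization."""
    return params * bits_per_weight / 8


def fits(model: Model, capacity_gb: float, bits_per_weight: float,
         overhead_gb: float = 0.0) -> bool:
    """Capacity check: all weights plus an optional overhead must fit in memory."""
    needed = weight_bytes(model.total_params, bits_per_weight) + overhead_gb * GB
    return needed <= capacity_gb * GB


def decode_ceiling(model: Model, bandwidth_gb_s: float,
                   bits_per_weight: float) -> float:
    """Memory-bound upper limit on decode speed, in tokens per second."""
    bytes_per_token = weight_bytes(model.active_params, bits_per_weight)
    return bandwidth_gb_s * GB / bytes_per_token


MODELS = [
    Model("Llama 3.1 8B Instruct", 8.03e9, 8.03e9),
    # Mixture-of-experts: 671B parameters stored, about 37B active per token.
    Model("DeepSeek-V3", 671e9, 37e9),
]

if __name__ == "__main__":
    for m in MODELS:
        ok = fits(m, capacity_gb=512, bits_per_weight=Q4_K_M_BITS)
        tps = decode_ceiling(m, bandwidth_gb_s=819, bits_per_weight=Q4_K_M_BITS)
        print(f"{m.name}: fits={ok}, ceiling ~{tps:.0f} tok/s")
```

Keeping `total_params` and `active_params` separate is what lets a mixture-of-experts model such as DeepSeek-V3 show a higher ceiling than a dense 70B model: capacity is set by everything stored, while decode speed is set only by what is read per token.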

Sources