The Open-Source AI Stack

The Stack · Silicon · Proprietary

Apple Silicon (M-series)

Unified memory architecture; closed silicon, but the strongest on-device inference platform via llama.cpp and MLX.

Proprietary · stable

Apple's M-series chips (M1, M2, M3, M4 across Pro, Max, Ultra tiers) are SoCs designed for laptops and desktops. The silicon is closed: Apple-designed cores implementing the Arm ISA, paired with closed accelerators (the GPU and the Neural Engine).

What makes them matter for open-source AI is the unified memory architecture: CPU, GPU, and Neural Engine share the same physical memory, so a 192GB Mac Studio has close to 192GB of "VRAM" for inference purposes, minus whatever macOS keeps for the system.

For local AI specifically (single user, batch size 1), inference is bandwidth-bound: each generated token requires reading a dense model's weights from memory, so tokens per second is capped by memory bandwidth divided by model size in bytes. Apple's wide LPDDR5/LPDDR5X memory bus delivers roughly 273-800 GB/s depending on tier (M4 Pro at the low end, M2/M3 Ultra at the high end). That, combined with high memory capacity, makes Macs one of the very few consumer platforms that can hold a 70B model and run it at usable speed. NVIDIA's consumer cards (4090, 5090) top out at 24-32GB, well below what a 70B model needs. AMD's Strix Halo (up to 96GB of unified memory assignable to the GPU) is the first credible non-Apple alternative.

Status: production-ready, and the de facto local-AI substrate in 2026. Used by llama.cpp (whose Apple Silicon backend is among its most optimized), Ollama (which wraps llama.cpp), and MLX (Apple's open ML framework built specifically for Apple Silicon).

The sovereignty critique: Apple's closed-everything ecosystem means "local AI on a Mac" still ties you to Apple's hardware and software roadmap. Genuinely open local-AI silicon at this capability remains a future bet (Tenstorrent, Strix Halo).
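The bandwidth-bound rule of thumb above can be turned into a quick back-of-the-envelope calculation. The sketch below is illustrative only: the function names are made up for this example, the 4-bit quantization and the 75% usable-memory fraction are assumptions rather than figures from this page, and it ignores KV cache, activations, and runtime overhead, so real throughput will land below these ceilings.

```python
def model_bytes(params_billion: float, bits_per_weight: float) -> float:
    """Approximate weight footprint in bytes (weights only, no KV cache)."""
    return params_billion * 1e9 * bits_per_weight / 8


def max_tokens_per_second(bandwidth_gb_s: float, size_bytes: float) -> float:
    """Upper bound at batch size 1: every token reads all weights once."""
    return bandwidth_gb_s * 1e9 / size_bytes


def fits(size_bytes: float, memory_gb: float, usable_fraction: float = 0.75) -> bool:
    """Whether the weights fit in the share of memory assumed usable.

    usable_fraction is a hypothetical safety margin, not a measured value.
    """
    return size_bytes <= memory_gb * 1e9 * usable_fraction


if __name__ == "__main__":
    # Assumption: a 70B-parameter dense model quantized to 4 bits per weight.
    size = model_bytes(70, 4)
    print(f"weights: {size / 1e9:.0f} GB")

    # Bandwidth endpoints quoted in the text (GB/s).
    for label, bandwidth in [("low tier", 273), ("high tier", 800)]:
        ceiling = max_tokens_per_second(bandwidth, size)
        print(f"{label}: at most {ceiling:.1f} tokens/s")

    # Capacity check: a 24 GB card versus a 192 GB unified-memory machine.
    for memory_gb in (24, 192):
        print(f"{memory_gb} GB fits: {fits(size, memory_gb)}")
```

Under these assumptions the weights come to about 35 GB, giving a ceiling of roughly 7.8 tokens/s at 273 GB/s and roughly 22.9 tokens/s at 800 GB/s. The same 35 GB does not fit in 24 GB of dedicated VRAM, which is the capacity argument the text makes.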


