AMD's data-center AI accelerator line. The MI300X shipped in late 2023 with 192 GB of HBM3, more memory than any other accelerator at the time; the MI325X refresh raised that to 256 GB of HBM3e. The hardware is closed, but the surrounding ROCm software stack leans open source: most of ROCm is published under open licenses, unlike CUDA, whose source is largely closed.

Compared to NVIDIA H100/H200: the MI300X has substantially more memory per accelerator (192 GB vs. 80 GB for the H100 and 141 GB for the H200), so it can hold larger models on a single device without sharding. Raw FLOPS are competitive. The gap is in software: ROCm lags CUDA by years on per-framework optimization. vLLM and other open runtimes support AMD, but production deployments still skew NVIDIA. AMD's positioning is "the credible non-NVIDIA option for inference at scale," not "the leader."

The line is production-ready and shipping at scale. Major hyperscaler deployments are confirmed: Microsoft Azure and Meta have both publicly run MI300X clusters for inference. Hugging Face treats ROCm as a first-class target. The strategic question for AMD is whether ROCm closes the software gap fast enough to break NVIDIA's lock-in before the next generation extends it.
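The memory argument above comes down to simple arithmetic, which the following minimal sketch works through. It is only a rough estimate under stated assumptions: the helper names (`weights_gb`, `min_accelerators`) are hypothetical, the 20% headroom for KV cache and activations is an illustrative guess rather than a measured figure, and the H200 capacity (141 GB) comes from NVIDIA's public spec rather than from this page.

```python
import math

# HBM capacity per accelerator in GB. MI300X, MI325X and H100 figures are the
# ones quoted on this page; the H200 figure (141 GB) is NVIDIA's public spec.
HBM_GB = {"H100": 80, "H200": 141, "MI300X": 192, "MI325X": 256}

# Bytes needed to store one parameter at common inference precisions.
BYTES_PER_PARAM = {"fp16": 2, "bf16": 2, "fp8": 1, "int4": 0.5}


def weights_gb(params_billion, dtype="fp16"):
    """Weight footprint in GB (1 GB = 1e9 bytes); weights only."""
    return params_billion * BYTES_PER_PARAM[dtype]


def min_accelerators(params_billion, accelerator, dtype="fp16", overhead=0.2):
    """Smallest device count whose combined HBM holds the weights plus headroom.

    `overhead` is an assumed fraction reserved for KV cache and activations;
    real requirements depend on batch size and context length.
    """
    needed = weights_gb(params_billion, dtype) * (1 + overhead)
    return math.ceil(needed / HBM_GB[accelerator])


if __name__ == "__main__":
    # A 70B-parameter model served in fp16.
    for name in HBM_GB:
        print(name, min_accelerators(70, name))
```

Under these assumptions a 70B-parameter fp16 model needs about 140 GB for weights alone, which exceeds a single H100 but fits on one MI300X. Real deployments typically round device counts up to a power of two for tensor parallelism, so treat the output as a lower bound.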
The Stack · Silicon · Proprietary
AMD MI300X / MI325X
Highest-memory accelerator on the market (192 GB+ HBM); ROCm software stack open-source-adjacent.
Sources
- AMD Instinct MI300X Product Page https://www.amd.com/en/products/accelerators/instinct/mi300/mi300x.html
- ROCm Documentation https://rocm.docs.amd.com/
- Microsoft Azure ND MI300X Announcement https://azure.microsoft.com/en-us/blog/azure-announces-new-ai-optimized-vm-series-featuring-amds-flagship-mi300x-gpu/
- AMD Instinct MI325X Product Page (audit-verified) https://www.amd.com/en/products/accelerators/instinct/mi300/mi325x.html
- Microsoft Tech Community: Introducing the New Azure AI Infrastructure VM Series ND MI300X v5 (audit-verified) https://techcommunity.microsoft.com/blog/azurehighperformancecomputingblog/introducing-the-new-azure-ai-infrastructure-vm-series-nd-mi300x-v5/4145152
- Meta Engineering: Meta's Open AI Hardware Vision (audit-verified) https://engineering.fb.com/2024/10/15/data-infrastructure/metas-open-ai-hardware-vision/
Other projects at the Silicon layer
6 siblings · ordered open first
- Tenstorrent (Wormhole, Blackhole) Open source
Open-trending AI accelerators on RISC-V; Jim Keller-led; tt-metal and tt-forge open.
- RISC-V Open source
Open instruction set architecture; royalty-free; substrate for open silicon (CPUs and emerging AI accelerators).
- NVIDIA H100 / H200 Proprietary
Hyperscaler-class AI accelerator with CUDA software moat; default frontier-training and frontier-inference hardware.
- Cerebras CS-3 Proprietary
Wafer-scale accelerator; proprietary but disruptive on inference economics for specific model sizes.
- Groq LPU Proprietary
Language Processing Unit; proprietary; extraordinarily fast inference for small-to-medium models at low batch sizes.
- Apple Silicon (M-series) Proprietary
Unified memory architecture; closed silicon, but the strongest on-device inference platform via llama.cpp and MLX.