The Open-Source AI Stack


GDDR7

The graphics memory generation on 2025-era consumer and workstation GPUs such as the RTX 5090 and RTX PRO 6000. High bandwidth per board, lower capacity than HBM.

GDDR7 is the graphics-DRAM generation used on workstation and high-end consumer GPUs in the 2025 to 2026 window. It sits between low-power LPDDR5X and stacked HBM on the memory bandwidth ladder: a board like the RTX 5090 reaches about 1.8 TB/s, enough to make it competitive with older datacenter parts on decode-bound work where the model fits.
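A rough way to see what that bandwidth buys on decode: each generated token streams the active weights once, so bandwidth divided by the weight footprint gives an upper bound on tokens per second. The sketch below is a back-of-envelope estimate, not a benchmark. The function name and the 8B FP16 example model are illustrative assumptions, and the bound ignores KV-cache reads, kernel overhead, and batching.

```python
def decode_ceiling_tokens_per_sec(bandwidth_gb_s: float,
                                  active_params_b: float,
                                  bytes_per_param: float) -> float:
    """Upper bound on decode tokens/sec when decode is memory-bandwidth bound.

    bandwidth_gb_s:  memory bandwidth in GB/s
    active_params_b: active parameters, in billions
    bytes_per_param: storage size of one weight (2.0 for FP16, 0.5 for INT4)
    """
    # Billions of parameters times bytes per parameter gives GB read per token.
    weight_gb = active_params_b * bytes_per_param
    return bandwidth_gb_s / weight_gb


# About 1.8 TB/s is the RTX 5090 figure from the text.
# The 8B dense model in FP16 is a hypothetical example, not from the source.
ceiling = decode_ceiling_tokens_per_sec(1800, 8, 2.0)
print(f"{ceiling:.1f} tokens/s ceiling")  # prints "112.5 tokens/s ceiling"
```

Real throughput lands below this ceiling, but the ratio is useful for comparing boards: doubling bandwidth or halving bytes per weight doubles the bound.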

The constraint with GDDR7 is capacity. It is mounted as discrete chips around the GPU rather than stacked on an interposer, so per-board capacity stays modest: 24 to 32 GB on consumer cards, up to 96 GB on the workstation RTX PRO 6000. That capacity ceiling is why consumer GPUs hit a wall on 70B-class models even though their bandwidth is ample, and why unified memory or multi-GPU is the alternative for large models.

For self-hosting, a GDDR7 card is the high-bandwidth, modest-capacity option: fast for what fits, capacity-limited for what does not. The quantization choice is what bridges the gap between the two.
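The capacity arithmetic can be sketched in a few lines. The example below checks whether a 70B-class model fits in 32 GB (consumer) or 96 GB (RTX PRO 6000) at different weight formats. The helper names are hypothetical, and the 20% overhead margin for KV cache and activations is an assumption chosen for illustration; actual overhead depends on context length and runtime.

```python
# Bytes per weight for common storage formats.
BYTES_PER_PARAM = {"FP16": 2.0, "FP8": 1.0, "INT8": 1.0, "INT4": 0.5}


def weight_gb(params_b: float, fmt: str) -> float:
    """Weight footprint in GB for a model with params_b billion parameters."""
    return params_b * BYTES_PER_PARAM[fmt]


def fits(params_b: float, fmt: str, capacity_gb: float,
         overhead_frac: float = 0.2) -> bool:
    """True if weights plus an assumed overhead margin fit in capacity_gb.

    overhead_frac is an illustrative assumption covering KV cache and
    activations, not a measured value.
    """
    return weight_gb(params_b, fmt) * (1 + overhead_frac) <= capacity_gb


# 70B-class model against the two capacity points named in the text.
for fmt in ("FP16", "FP8", "INT4"):
    for capacity_gb in (32, 96):
        verdict = "fits" if fits(70, fmt, capacity_gb) else "does not fit"
        print(f"70B {fmt} ({weight_gb(70, fmt):.0f} GB weights) "
              f"on {capacity_gb} GB: {verdict}")
```

Under these assumptions a 70B model does not fit a 32 GB card even at INT4 (35 GB of weights alone), while FP8 and INT4 fit within 96 GB. Smaller models show the same pattern one tier down, which is where the quantization choice decides whether a consumer card is usable at all.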
