Glossary
decentralized training
Training a model across many independently operated nodes that are not tightly coupled, as opposed to single-cluster training; the architecture for community-owned model production.
Training schemes designed to work across loosely connected nodes rather than a tightly coupled cluster. The shared problem: each node holds only a shard of the data, and in some schemes only part of the model, while the bandwidth between nodes is far lower than between GPUs in a single rack. Decentralized training cuts the communication needed per training step so that consumer-grade internet links between nodes do not bottleneck training.
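A rough back-of-the-envelope sketch of why the link matters. Every number below is an illustrative assumption, not a measurement: a 1-billion-parameter model, 16-bit values, a 100 Mbit/s home link, and a 400 Gbit/s in-rack link. The function names are hypothetical. Latency, compression, and the aggregation topology are all ignored.

```python
def transfer_seconds(num_params, bytes_per_param, link_bits_per_second):
    """Seconds to send one full copy of the parameters over a single link."""
    payload_bits = num_params * bytes_per_param * 8
    return payload_bits / link_bits_per_second


def amortized_seconds_per_step(seconds_per_sync, steps_between_syncs):
    """Communication cost spread evenly over the steps between two syncs."""
    return seconds_per_sync / steps_between_syncs


# All values are assumptions chosen for illustration only.
PARAMS = 1_000_000_000      # assumed model size: 1B parameters
BYTES_PER_PARAM = 2         # assumed 16-bit parameter values
HOME_LINK_BPS = 100e6       # assumed consumer uplink: 100 Mbit/s
RACK_LINK_BPS = 400e9       # assumed in-rack interconnect: 400 Gbit/s

if __name__ == "__main__":
    home = transfer_seconds(PARAMS, BYTES_PER_PARAM, HOME_LINK_BPS)
    rack = transfer_seconds(PARAMS, BYTES_PER_PARAM, RACK_LINK_BPS)
    print(f"one full exchange, home link: {home:.2f} s")
    print(f"one full exchange, rack link: {rack:.2f} s")
    print(f"home link, sync every 500 steps: "
          f"{amortized_seconds_per_step(home, 500):.2f} s per step")
```

Under these assumed numbers a full exchange takes 160 seconds on the home link and 0.04 seconds in the rack. Syncing once every 500 steps brings the home-link cost down to 0.32 seconds per step, which is the kind of reduction these schemes aim for.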
The research strands include DiLoCo and its descendants (many local steps, occasional global aggregation), federated learning (data stays local; only model updates such as gradients or weight deltas are shared), and swarm-training schemes that route layer execution across volunteer nodes (Petals).
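The "many local steps, occasional global aggregation" pattern can be sketched with plain local SGD and periodic weight averaging. This is the simplest member of the family, not DiLoCo's actual optimizer setup. The scalar model, the toy shards, and the function name `local_sgd` are all assumptions made for illustration.

```python
def local_sgd(shards, total_steps, sync_every, lr=0.01, w0=0.0):
    """Fit a scalar model y = w * x on per-node data shards.

    Every node takes gradient steps on its own shard only. After each
    `sync_every` steps, all nodes replace their weight with the average
    across nodes; this is the only point where nodes communicate.
    Returns (final_weight, number_of_sync_rounds).
    """
    weights = [w0] * len(shards)
    sync_rounds = 0
    for step in range(1, total_steps + 1):
        for i, shard in enumerate(shards):
            # Full-batch gradient of mean squared error on this node's shard.
            grad = sum(2 * (weights[i] * x - y) * x for x, y in shard) / len(shard)
            weights[i] -= lr * grad
        if step % sync_every == 0:
            avg = sum(weights) / len(weights)
            weights = [avg] * len(weights)
            sync_rounds += 1
    return sum(weights) / len(weights), sync_rounds


# Toy data (assumed): four nodes, three noise-free points each, true slope 3.
SHARDS = [[(float(x), 3.0 * x) for x in range(1 + k, 4 + k)] for k in range(4)]

if __name__ == "__main__":
    for every in (1, 20):
        w, rounds = local_sgd(SHARDS, total_steps=200, sync_every=every)
        print(f"sync_every={every}: w={w:.6f}, sync rounds={rounds}")
```

On this toy problem both settings recover the same slope, but syncing every 20 steps needs 10 communication rounds instead of 200. Real workloads are noisier and the shards disagree more, so the trade-off between sync interval and final quality is not free.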
The sovereignty argument: decentralized training is the architectural counterweight to the centralization of frontier-model production in five or six labs with hyperscaler clusters. The current limitation: no production-grade frontier model has been trained this way. The techniques work; the coordination, governance, and capital structure have not been demonstrated at the scale that would matter.