Glossary

DeepSeek

A Chinese open-weight family known for the V3 MoE base model and the R1 reasoning model, both released under permissive licenses and unusually transparent in their training-cost reporting.

Weights also: Training

A 2023-founded Chinese lab that became one of the most-watched open labs through late 2024 and 2025. DeepSeek-V3 (December 2024) was a 671B-parameter mixture-of-experts model with 37B parameters active per token, released under a permissive license and accompanied by an unusually candid technical report that included the training cost (~$5.6M for the final run). R1 (January 2025) demonstrated reasoning-mode behavior trained largely through reinforcement learning (RL) on verifiable rewards, with the recipe published.
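To make the "671B total, 37B active per token" figures concrete, here is a minimal Python sketch of top-k expert routing, the general mechanism by which a mixture-of-experts model activates only a fraction of its parameters per token. The expert count (8), the choice of k (2), the softmax weighting, and the function name `top_k_route` are illustrative assumptions, not DeepSeek-V3's actual configuration or router. Only the 671B and 37B figures come from the entry.

```python
import math
import random

def top_k_route(scores, k):
    """Return the indices of the k highest-scoring experts and their softmax weights."""
    chosen = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)[:k]
    peak = max(scores[i] for i in chosen)  # subtract the max for numerical stability
    exps = [math.exp(scores[i] - peak) for i in chosen]
    total = sum(exps)
    return chosen, [e / total for e in exps]

# Figures from the entry: 671B total parameters, 37B active per token.
TOTAL_PARAMS_B = 671
ACTIVE_PARAMS_B = 37
active_fraction = ACTIVE_PARAMS_B / TOTAL_PARAMS_B  # roughly 0.055

# Toy router (hypothetical sizes): 8 experts, 2 selected for this token.
rng = random.Random(0)
router_scores = [rng.gauss(0.0, 1.0) for _ in range(8)]
experts, weights = top_k_route(router_scores, k=2)

print(f"active fraction: {active_fraction:.1%}")
print(f"token routed to experts {experts} with weights {[round(w, 3) for w in weights]}")
```

Only the selected experts run for a given token, so per-token compute tracks the active parameter count (about 5.5% of the total here) rather than the full model size.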

The combination (frontier-class capability, open weights, detailed technical reporting) was unusual enough to reset community expectations of what open releases could include.

Full coverage at /projects/deepseek.
