DeepSeek V3 / R1

DeepSeek is a Chinese frontier-AI lab whose open releases reset two ceilings between late 2024 and early 2025. DeepSeek-V3 (Dec 2024) demonstrated frontier-class capability at a training cost an order of magnitude below what US frontier labs were spending, using a Mixture-of-Experts architecture with several architectural innovations (Multi-head Latent Attention, auxiliary-loss-free MoE load balancing). DeepSeek-R1 (Jan 2025) was the first openly released frontier-class reasoning model, and its R1-Zero variant showed that pure reinforcement-learning post-training could produce strong reasoning without the usual supervised fine-tuning (SFT) step.

DeepSeek matters because it forced a re-evaluation of what open-weights labs can accomplish on a constrained compute budget. Compared to its siblings: Llama has a similar capability ceiling but a restrictive license, versus DeepSeek's MIT; Qwen shares the Chinese-lab open posture but makes different architectural choices; OLMo is truly open, including its data, but has a lower capability ceiling. DeepSeek's distinctive angle is "frontier capability on permissive license (MIT), at a cost structure that breaks the hyperscaler-budget assumption."

Production-ready; widely served by hosted-inference providers and self-hosters. The subsequent V3.1 release (mid-2025) introduced hybrid reasoning: one model with both thinking and non-thinking modes.

Caveats: the training data is not disclosed, so the models do not satisfy a strict reading of OSAID, and the trust questions about data provenance and political content filtering apply as they do to any closed-data lab.

Other projects at the Weights layer

9 siblings · ordered open first

Mistral / Mixtral Open source

French lab; older open releases under Apache 2.0; flagships increasingly API-only or under research-tier licenses.

Qwen (Alibaba) Open source

Alibaba's aggressive open-weights series (Qwen 2.5 / 3); Apache 2.0 across most sizes; full-precision weights available.

OLMo (AI2) Open source

The only major model family meeting the strictest reading of OSAID: data (Dolma), training code, and weights all published.

Phi (Microsoft) Open source

Small open models heavy on synthetic-data training; MIT license; cost-effective inference at edge sizes.

Kimi (Moonshot AI) Open source

Chinese open-weights series; emphasis on long-context performance.

GLM (Zhipu AI) Open source

Tsinghua spinoff; ChatGLM and GLM-4 families; Apache 2.0 for major releases.

Yi (01.AI) Open source

Kai-Fu Lee's Chinese open model family (Yi-34B etc.); Apache 2.0.

Llama (Meta) Source available

Meta's open-weights family; dominant in usage; license carries a 700M-MAU clause and acceptable-use restrictions.

Gemma (Google) Source available

Google's open-weights siblings to Gemini; source-available, not OSI-approved.
