OLMo (AI2)

OLMo is a family of open language models from the Allen Institute for AI (AI2). What makes OLMo distinctive is not its capability ceiling, which is below the frontier, but its depth of disclosure: Apache 2.0 weights, the full training data (the Dolma corpus), training code, the evaluation suite, intermediate checkpoints, and training logs are all published, so the training pipeline is reproducible end to end. No other major model family in 2026 meets this bar.

OLMo matters because it answers the question "what does 'open source AI' actually look like at the strictest reading of OSAID v1.0?" Llama, Mistral, Qwen, DeepSeek, and Gemma all publish weights, but the training data is closed for all of them. Under the Open Source Initiative's published definition, only OLMo qualifies as fully open. This makes OLMo the existence proof that complete training-data disclosure is feasible at all, and the reference example anyone arguing for stricter open-AI definitions can point to.

OLMo is production-ready as a research and reference model. OLMo 2 (32B, March 2025) and the OLMo 3 family (November 2025) closed much of the capability gap with similar-size open-weights peers, though not with the frontier. It is strong for academic use, reproducibility research, and post-training experimentation, and less competitive with Llama 3, Qwen 3, and DeepSeek on raw application performance. The project is stewarded by AI2, so its long-term direction depends on AI2's continued funding for the program.

Other projects at the Weights layer

9 siblings · ordered open first

Mistral / Mixtral Open source

French lab; older open releases under Apache 2.0; flagships increasingly API-only or under research-tier licenses.

Qwen (Alibaba) Open source

Alibaba's aggressive open-weights series (Qwen 2.5 / 3); Apache 2.0 across most sizes; full-precision weights available.

DeepSeek V3 / R1 Open source

Cost-quality reset; the V3 paper documented architectural innovations (MoE, multi-head latent attention, auxiliary-loss-free load balancing); R1 is an open reasoning model.

Phi (Microsoft) Open source

Small open models heavy on synthetic-data training; MIT license; cost-effective inference at edge sizes.

Kimi (Moonshot AI) Open source

Chinese open-weights series; emphasis on long-context performance.

GLM (Zhipu AI) Open source

Tsinghua spinoff; ChatGLM and GLM-4 families; Apache 2.0 for major releases.

Yi (01.AI) Open source

Kai-Fu Lee's Chinese open model family (Yi-34B etc.); Apache 2.0.

Llama (Meta) Source available

Meta's open-weights family; dominant in usage; license carries a 700M-MAU clause and acceptable-use restrictions.

Gemma (Google) Source available

Google's open-weights siblings to Gemini; source-available, not OSI-approved.
