OLMo is a family of open language models from the Allen Institute for AI (AI2). What makes OLMo distinctive is not its capability ceiling, which sits below the frontier, but the depth of its disclosure: Apache 2.0 weights, the full training data (the Dolma corpus), training code, evaluation suite, intermediate checkpoints, and training logs are all published, so the training pipeline is reproducible end to end. No other major model family in 2026 meets this bar.

OLMo matters because it answers the question "what does 'open source AI' actually look like at the strictest reading of OSAID v1.0?" Llama, Mistral, Qwen, DeepSeek, and Gemma all publish weights, but the training data is closed for all of them. Under a strict reading of the Open Source Initiative's published definition, OLMo is the only one of these that qualifies as fully open. This makes OLMo the existence proof that full training-data disclosure is feasible at all for a modern large model, and the reference example anyone arguing for stricter open-AI definitions can point to.

OLMo is production-ready as a research and reference model. OLMo 2 (32B, March 2025) and the OLMo 3 family (November 2025) closed much of the capability gap with similar-size open-weights peers, though not with the frontier. It is strong for academic use, reproducibility research, and post-training experimentation, and less competitive with Llama 3, Qwen 3, and DeepSeek on raw application performance. OLMo is stewarded by AI2, so its long-term direction depends on AI2's continued funding for the program.
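The distinction between a fully open release and a weights-only release can be sketched as a simple checklist. This is a minimal illustration, assuming the strict reading described above (weights, training data, and training code all published). The component labels, the `Release` type, and the `openness` function are hypothetical names chosen for this example, not terms from OSAID or from any AI2 tooling.

```python
from dataclasses import dataclass

# Components required under the strict reading described above.
# These labels are illustrative assumptions, not OSAID's own wording.
REQUIRED = frozenset({"weights", "training_data", "training_code"})


@dataclass(frozen=True)
class Release:
    name: str
    published: frozenset  # labels of the components the release discloses


def missing_components(release: Release) -> list:
    """Return the required components the release does not publish, sorted."""
    return sorted(REQUIRED - release.published)


def openness(release: Release) -> str:
    """Classify a release under the simplified strict reading."""
    if not missing_components(release):
        return "fully open"
    if "weights" in release.published:
        return "open weights"
    return "closed"


# OLMo's disclosure as described above: every required component plus extras.
olmo = Release(
    "OLMo",
    frozenset({"weights", "training_data", "training_code",
               "eval_suite", "checkpoints", "training_logs"}),
)

# A hypothetical weights-only release, for contrast.
weights_only = Release("weights-only example", frozenset({"weights"}))

print(openness(olmo))
print(openness(weights_only), missing_components(weights_only))
```

The point of the sketch is that "fully open" is a conjunction: a release that omits any one required component falls back to the weaker category, no matter how permissive its weight license is.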
The Stack · Weights · Open source
OLMo (AI2)
The only major model family meeting the strictest reading of OSAID: data (Dolma), training code, and weights all published.
Sources
- OLMo at AI2 https://allenai.org/olmo
- OLMo 2 Release https://allenai.org/blog/olmo2
- Dolma Corpus https://huggingface.co/datasets/allenai/dolma
- Open Source AI Definition (OSI) https://opensource.org/ai/open-source-ai-definition
Other projects at the Weights layer
9 siblings · ordered open first
- Mistral / Mixtral · Open source
French lab; older open releases under Apache 2.0; flagships increasingly API-only or under research-tier licenses.
- Qwen (Alibaba) · Open source
Alibaba's aggressive open-weights series (Qwen 2.5 / 3); Apache 2.0 across most sizes; full-precision weights available.
- DeepSeek V3 / R1 · Open source
Cost-quality reset; the V3 technical report documented architectural innovations (multi-head latent attention, MoE with auxiliary-loss-free load balancing); R1 is an open reasoning model.
- Phi (Microsoft) · Open source
Small open models trained heavily on synthetic data; MIT license; cost-effective inference at edge sizes.
- Kimi (Moonshot AI) · Open source
Chinese open-weights series; emphasis on long-context performance.
- GLM (Zhipu AI) · Open source
Tsinghua spinoff; ChatGLM and GLM-4 families; Apache 2.0 for major releases.
- Yi (01.AI) · Open source
Kai-Fu Lee's Chinese open model family (Yi-34B etc.); Apache 2.0.
- Llama (Meta) · Source available
Meta's open-weights family; dominant in usage; license carries a 700M monthly-active-user clause and acceptable-use restrictions.
- Gemma (Google) · Source available
Google's open-weights siblings to Gemini; source-available, not OSI-approved.
Grants attributed
1 match from /grants
- OLMo 3 / Molmo 2 release line 2025-11 · Internal Ai2 funding
AI2 internal teams · funded by AI2
OLMo 3 family released November 2025; OLMo 3.1 December 2025; Molmo 2 video December 2025. The only major fully-open model lineage.