The Open Multimodal AI Infrastructure (OMAI) partnership is a $152M Mid-Scale Research Infrastructure award from the U.S. National Science Foundation ($75M) and NVIDIA ($77M), led by Noah A. Smith at the Allen Institute for AI (Ai2). The official Ai2 announcement is dated August 14, 2025; the compute cluster itself came online in May 2026. The program funds compute hardware plus the open data and open model lineage Ai2 already ships.
The cluster runs on NVIDIA HGX B300 systems powered by Blackwell Ultra GPUs. It is deployed and managed in partnership with Cirrascale Cloud Services, and Supermicro supplies the underlying platforms. Co-PIs sit at the University of Washington (Hanna Hajishirzi), University of Hawai'i at Hilo (Travis Mandel), University of New Hampshire (Samuel Carton), and University of New Mexico (Sarah Dreier). The award is structured as research infrastructure (compute time plus open release), not a model-training contract.
Models trained on the cluster slot into Ai2's existing lineage: OLMo (text language models) and Molmo (multimodal). OLMo 3, released November 2025, ships as 7B and 32B dense transformers with a 65,536-token context window; all four variants (Base, Think, Instruct, RL Zero) are released under Apache 2.0. The training stack is fully open. Dolma 3 is a 9.3T-token pretraining corpus combining web text, scientific PDFs, and code repos; OLMo 3 base models actually train on its 5.9T-token "Dolma 3 Mix" subset. Dolmino and Longmino are filtered high-quality subsets used for mid-training, and Dolci is the post-training data suite for SFT, DPO, and RLVR.
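The stage-to-dataset mapping described above can be written down as a small data structure, which also makes the size of the Dolma 3 Mix relative to the full corpus easy to check. This is only an illustrative sketch: the names Stage, OLMO3_FLOW, and mix_fraction are hypothetical and do not come from any Ai2 codebase, and the token counts are the rounded figures quoted in the text.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Stage:
    """One training stage and the datasets the text associates with it."""
    name: str
    datasets: Tuple[str, ...]


# Hypothetical summary of the flow described in the text, not an Ai2 config.
OLMO3_FLOW = (
    Stage("pretraining", ("Dolma 3 Mix",)),
    Stage("mid-training", ("Dolmino", "Longmino")),
    Stage("post-training", ("Dolci",)),
)

# Rounded token counts as quoted in the text.
DOLMA3_TOTAL_TOKENS = 9.3e12
DOLMA3_MIX_TOKENS = 5.9e12


def mix_fraction(mix: float = DOLMA3_MIX_TOKENS,
                 total: float = DOLMA3_TOTAL_TOKENS) -> float:
    """Share of the full Dolma 3 corpus that the training mix represents."""
    return mix / total


if __name__ == "__main__":
    for stage in OLMO3_FLOW:
        print(f"{stage.name}: {', '.join(stage.datasets)}")
    print(f"Dolma 3 Mix share of Dolma 3: {mix_fraction():.1%}")
```

With the quoted figures, the mix comes to roughly 63% of the full corpus.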
What distinguishes Ai2 from "open weights" labs is that the data recipes, training code, intermediate checkpoints, and evaluation suites (OLMES, OlmoBaseEval) are released alongside the weights. There is no closed-source corpus held back; the model flow can be reproduced or modified end-to-end. The Apache 2.0 license permits commercial use without per-token licensing.
For the open-source AI thesis, OMAI is the largest single U.S. public commitment to fully open foundation-model training to date. The structure pairs federal research funding with industry-supplied accelerators rather than routing public dollars through a closed lab. It positions Ai2 as one of the few institutions globally (alongside European programs like OpenEuroLLM) with both frontier compute and a mandate to release weights, data, and training code together.
Recipient
Allen Institute for AI
Funder
National Science Foundation · government · US
Largest sustained US funder of AI research. Runs the AI Research Institutes program and partner programs (e.g., NSF-NVIDIA OMAI).