The Open-Source AI Stack

Grants · Project grant · UK

Data for the AIs

Building open data resources for AI training. 50th cohort Emergent Ventures award.

Laura Ryan, based in London, was selected by Emergent Ventures in its 50th cohort (December 2025) for work on open data resources for AI training. Like other Emergent Ventures grants, the award is made at Tyler Cowen's discretion through the Mercatus Center, falls in the program's typical $1K to $50K range, and is intended as catalytic funding for an individual at a very early stage of a project.

The data layer is one of the persistent gaps for open-weights models. Crawled web text (Common Crawl, FineWeb, Dolma, the Common Pile) covers general pretraining, but specialized domain corpora, instruction-tuning sets, and high-quality evaluation data are thinner. Closed labs are believed to devote substantial internal headcount to curation and licensing, and the resulting datasets remain private. Emergent Ventures has funded several adjacent open-data and math-corpus efforts in recent cohorts, including the Reyansh Sharma grant for open-source math earlier in 2025.

Public detail beyond the cohort announcement is not yet available; Ryan's specific dataset, licensing model, and target capability are not described in the Marginal Revolution post.

Recipient

Laura Ryan

Funder

Emergent Ventures (Mercatus Center) · foundation · Global

Tyler Cowen's discretionary grant program at George Mason University's Mercatus Center; funds individuals working on under-supplied ideas including AI tools, AI policy, and AI for science.

Primary source

https://marginalrevolution.com/marginalrevolution/2025/12/emergent-ventures-winners-50th-cohort.html
