Technical AI Safety RFP

Open Philanthropy (now operating as Coefficient Giving) ran a Technical AI Safety request for proposals that closed April 15, 2025, planning roughly $40M in grants disbursed over the following five months. The RFP partitioned funding across 21 distinct research areas organized into five categories, signaling a deliberately wide aperture rather than a single named research bet.

The five categories are Adversarial Machine Learning (jailbreaks, control evaluations, backdoors and alignment stress tests, alternatives to adversarial training, robust unlearning), Exploring Sophisticated LLM Misbehavior (alignment faking experiments, encoded reasoning in chain-of-thought and inter-model communication, black-box LLM psychology, evaluating hidden dangerous behaviors, reward hacking of human oversight), Model Transparency (white-box techniques, activation monitoring, feature representations, toy models, externalized reasoning, interpretability benchmarks, more transparent architectures), Trust from First Principles (white-box estimation of rare misbehavior, theoretical study of inductive biases), and Alternative Risk Mitigation Approaches (conceptual clarity, moonshots for superintelligence alignment).

Eight areas were starred as funder priorities, among them jailbreaks and unintentional misalignment, control evaluations, backdoors, alternatives to adversarial training, alignment faking experiments, and encoded reasoning in CoT. The starred set skews toward empirical work on deceptive or hidden behavior in current frontier models rather than longer-horizon theoretical agendas. Grant sizes were not pre-specified; the program accepted proposals ranging from individual researchers to multi-researcher teams.

The funder rebranded mid-cycle: the original openphilanthropy.org RFP URL now redirects to coefficientgiving.org. The 21-area structure remains the canonical reference grid for what the funder considers in-scope for technical safety, and other funders (the Alignment Project, AISF) have echoed similar categorical breakdowns.

As a single-cycle commitment, the $40M figure made this one of the largest external technical safety pools opened in 2025. It is a useful baseline for what a mid-eight-figure safety RFP looks like, and the area list can be read as a signal of which topics the largest U.S. AI-safety funder considers tractable.

Recipient

Multiple grantees

Funder

Open Philanthropy (Coefficient Giving) · foundation · US

Technical AI safety research across 21 areas. The default mover in mid-7-figure safety grants.

Primary source

https://www.openphilanthropy.org/request-for-proposals-technical-ai-safety-research/
