Coercion-capability measurement project · Grant

The Cooperative AI Foundation funded a 2025 project to develop practical methods for measuring coercive capabilities of AI agents and to model the risks associated with different capability levels. The grant text frames coercion as a dual-use property: strong coercive capabilities in deployed AI systems could enable large-scale societal harms through misuse, but some capabilities underlying coercion (specifically, increasing the credibility of commitments) are also essential for fostering cooperation between agents. The project aims to disambiguate the two by building measurement tools rather than treating coercion as a purely qualitative concern.

The grant went to Sophia Hatz at Uppsala University's Department of Peace and Conflict Research, with a funding package of 639,830 SEK (roughly $65K) covering the measurement-tool development work. Read alongside CAIF's other grants (which include the 2021-2025 FOCAL lab grant at CMU and a range of multi-agent cooperation projects), the coercion-measurement project fits a portfolio strategy of funding upstream measurement infrastructure for properties that will matter once AI agents operate autonomously in adversarial settings.

The intellectual heritage runs through the game theory of credible threats and commitments, where the line between cooperation-enabling commitment (you can credibly promise to keep your word) and coercion-enabling commitment (you can credibly threaten harm) is structurally thin. Building empirical benchmarks for this distinction in LLM-based agents is technically distinct from the existing red-teaming and dangerous-capability-evaluation traditions, which tend to focus on single-shot prompts rather than multi-step negotiation dynamics.
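The thin line described above can be illustrated with a toy model. The following is a minimal sketch, not the project's method: the class, function names, and payoff numbers are all hypothetical, and credibility is reduced to a single check (the sender either prefers to follow through or is bound by some commitment device). It shows that the same binding capability makes a promise and a threat credible, and that only the sign of the effect on the receiver tells them apart.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConditionalCommitment:
    """Hypothetical statement of the form 'if you do X, I will do Y'."""
    label: str
    follow_through_payoff: float  # sender's payoff if it carries out Y
    back_down_payoff: float       # sender's payoff if it reneges on Y
    effect_on_receiver: float     # change in receiver's payoff when Y happens


def is_credible(c: ConditionalCommitment, binding: bool = False) -> bool:
    """Return True if the receiver should believe the sender will follow through.

    Assumption: without a commitment device, the statement is believed only
    if following through is in the sender's own interest at that point.
    With a binding device, it is believed regardless of those payoffs.
    """
    return binding or c.follow_through_payoff >= c.back_down_payoff


def classify(c: ConditionalCommitment) -> str:
    """Label by the effect on the receiver, not by the credibility mechanism."""
    return "promise" if c.effect_on_receiver >= 0 else "threat"


# Illustrative payoffs only; following through is costly for the sender in both cases.
promise = ConditionalCommitment("pay on delivery", -1.0, 0.0, 1.0)
threat = ConditionalCommitment("retaliate on refusal", -2.0, 0.0, -3.0)

for c in (promise, threat):
    print(c.label, classify(c), is_credible(c), is_credible(c, binding=True))
```

In this sketch neither statement is credible on its own, and both become credible once the sender can bind itself, which is the dual-use property the grant text points to. A real benchmark would need to measure this over multi-step interactions rather than a single payoff comparison.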

Within the open-source AI stack, the work sits in the evaluation and safety-guardrails layer. It is conceptually adjacent to METR's autonomy evaluations and to the dangerous-capability sections of the major labs' system cards, but it uses a multi-agent, game-theoretic frame rather than a single-agent capability frame.

Recipient

Sophia Hatz (Uppsala University)

Funder

Cooperative AI Foundation · foundation · UK

Funds research that improves AI agents' capacity for cooperation with each other and with humans, including measurement of cooperation-relevant capabilities and propensities.

Primary source

https://www.cooperativeai.com/post/grant-summaries

More from Cooperative AI Foundation

FOCAL lab at Carnegie Mellon · 2021-09-01 · $500,000 (2021-2025)

Multi-year grant establishing the Foundations of Cooperative AI Lab (FOCAL) under Vincent Conitzer at Carnegie Mellon University. The lab develops decision and game theory for cooperation between advanced machine agents, with outputs including workshops, online seminar series, and visitor programs.