The Open-Source AI Stack

Grants · Project grant · EU

Coercion-capability measurement project

Develops practical methods to measure coercive capabilities of AI agents and to model the risks associated with different levels of those capabilities.

The Cooperative AI Foundation (CAIF) funded a 2025 project to develop practical methods for measuring the coercive capabilities of AI agents and to model the risks associated with different capability levels. The grant text frames coercion as a dual-use property: strong coercive capabilities in deployed AI systems could enable large-scale societal harms through misuse, but some of the capabilities underlying coercion (specifically, increasing the credibility of commitments) are also essential for fostering cooperation between agents. The project aims to separate the two by building measurement tools rather than treating coercion as a purely qualitative concern.

The grant went to Sophia Hatz at Uppsala University's Department of Peace and Conflict Research, with a funding package of 639,830 SEK (roughly $65K) covering the measurement-tool development work. Read alongside CAIF's other 2025 grants (which include FOCAL at CMU and a range of multi-agent cooperation projects), the coercion-measurement project fits a portfolio strategy of funding upstream measurement infrastructure for properties that will matter once AI agents operate autonomously in adversarial settings.

The intellectual heritage runs through the game theory of credible threats and commitments, where the line between cooperation-enabling commitment (you can credibly promise to keep your word) and coercion-enabling commitment (you can credibly threaten harm) is structurally thin. Building empirical benchmarks for this distinction in LLM-based agents is technically distinct from the existing red-teaming and dangerous-capability-evaluation traditions, which tend to focus on single-shot prompts rather than multi-step negotiation dynamics.
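The thin line between the two kinds of commitment can be seen in a toy two-player game. The sketch below is illustrative only: the payoff numbers, the game layouts, and the function name `solve` are assumptions made up for this example and do not come from the grant text or the funded project. Player B moves first; if the game continues, player A replies. Without commitment, A picks whatever reply is best for A at that point; with commitment, A is bound to a reply announced in advance.

```python
from typing import Dict, Optional, Tuple, Union

Payoff = Tuple[float, float]              # (payoff to A, payoff to B)
Node = Union[Payoff, Dict[str, Payoff]]   # terminal payoff, or A's possible replies


def solve(game: Dict[str, Node], committed: Optional[str] = None):
    """Return (B's move, A's reply or None, payoffs) for a two-stage game.

    If `committed` is given, A is bound to that reply; otherwise A plays
    its own best reply once B has moved (backward induction).
    """
    best = None
    for b_move, node in game.items():
        if isinstance(node, tuple):
            # B's move ends the game immediately.
            a_reply, payoff = None, node
        else:
            if committed is not None:
                a_reply = committed
            else:
                a_reply = max(node, key=lambda a: node[a][0])
            payoff = node[a_reply]
        # B anticipates A's reply and keeps the move with the highest B payoff.
        if best is None or payoff[1] > best[2][1]:
            best = (b_move, a_reply, payoff)
    return best


# Hypothetical payoffs, chosen only to illustrate the structure.
threat = {
    "concede": (2, -1),
    "resist": {"punish": (-1, -3), "back_down": (0, 0)},
}
promise = {
    "decline": (0, 0),
    "trust": {"honor": (1, 1), "defect": (2, -1)},
}

if __name__ == "__main__":
    print(solve(threat))              # no commitment
    print(solve(threat, "punish"))    # committed threat
    print(solve(promise))             # no commitment
    print(solve(promise, "honor"))    # committed promise
```

Under these assumed payoffs, the same commitment ability changes both games. In the threat game B resists when A cannot commit and concedes when A can, which leaves B worse off. In the promise game B declines when A cannot commit and trusts when A can, which leaves both better off. A measurement approach that only detects "can this agent make commitments credible" would not, on its own, tell the two cases apart.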

Within the open-source AI stack, the work belongs under evaluation and safety guardrails. It is conceptually adjacent to METR's autonomy evaluations and to the dangerous-capability sections of the major labs' system cards, but it uses a multi-agent, game-theoretic frame rather than a single-agent capability frame.

Recipient

Sophia Hatz (Uppsala University)

Funder

Cooperative AI Foundation · foundation · UK

Funds research that improves AI agents' capacity for cooperation with each other and with humans, including measurement of cooperation-relevant capabilities and propensities.

Primary source

https://www.cooperativeai.com/post/grant-summaries
