MACHIAVELLI
Do agents take unethical actions in the pursuit of goals? The MACHIAVELLI benchmark measures this by having agents play choose-your-own-adventure style games. Each game includes achievements that the agent is instructed to pursue. The agent is scored both on how many achievements it unlocks and on the unethical actions it takes to get there. The benchmark comprises thirty games with millions of annotations.
Overview
⚠️ External evaluation. Code lives in an upstream repository. inspect_evals lists it for discoverability; review the upstream repo and pinned commit before running.
Source: Plyb/inspect-machiavelli@6c61494
Usage
Installation
This is an externally-maintained evaluation. Clone the upstream repository at the pinned commit and install its dependencies:
git clone https://github.com/Plyb/inspect-machiavelli
cd inspect-machiavelli
git checkout 6c6149488e7d6ecc02df8ca0b14c7ba783f16715
uv sync
Running evaluations
CLI
uv run inspect eval src/machiavelli_task.py@machiavelli --model openai/gpt-5-nano
Python
from inspect_ai import eval
from machiavelli_task import machiavelli
eval(machiavelli(), model="openai/gpt-5-nano")
View logs
uv run inspect view
More information
For the dataset, scorer, task parameters, and validation, see the upstream repo: Plyb/inspect-machiavelli.
Options
You can control a variety of options from the command line. For example:
uv run inspect eval src/machiavelli_task.py@machiavelli --limit 10
uv run inspect eval src/machiavelli_task.py@machiavelli --max-connections 10
uv run inspect eval src/machiavelli_task.py@machiavelli --temperature 0.5
See uv run inspect eval --help for all available options.
More command-line options: Inspect docs
Evaluation Report
Timestamp: June 2026
Commit: 6c61494
uv run inspect eval src/machiavelli_task.py@machiavelli --model <model>

| Model | Provider | Game.score | Power.ς | Utility.ς | Violations.ς |
|---|---|---|---|---|---|
| anthropic/claude-sonnet-4-6 | Anthropic | 134.400 | 98.990 | 101.640 | 92.140 |
| anthropic/claude-haiku-4-5 | Anthropic | 85.610 | 94.400 | 90.620 | 94.950 |
| hf/Qwen/Qwen3-4B-Instruct-2507-FP8 | Hugging Face | 121.840 | 101.090 | 100.000 | 93.470 |
Notes:
- Run on all 30 games, max trajectory length 1000, no history in prompt.
- Both Claude models were run with -T use_multiple_choice=true
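
To rerun the report for several models, the per-model commands can be generated rather than typed by hand. The following is a minimal sketch, not part of the upstream repository: `build_eval_command`, `report_commands`, and `REPORT_RUNS` are hypothetical names introduced here, and only flags that appear on this page (`--model`, `-T`) are used. The task parameters for trajectory length and prompt history are not named on this page, so they are left out; check the upstream repo for them.

```python
import shlex

# Task reference exactly as used in the commands on this page.
TASK = "src/machiavelli_task.py@machiavelli"

# Models from the evaluation report. Per the notes, only the Claude runs
# used -T use_multiple_choice=true. (Hypothetical structure for illustration.)
REPORT_RUNS = {
    "anthropic/claude-sonnet-4-6": {"use_multiple_choice": "true"},
    "anthropic/claude-haiku-4-5": {"use_multiple_choice": "true"},
    "hf/Qwen/Qwen3-4B-Instruct-2507-FP8": {},
}


def build_eval_command(model, task_args=None):
    """Hypothetical helper: assemble the `inspect eval` argument list for one model."""
    cmd = ["uv", "run", "inspect", "eval", TASK, "--model", model]
    # Each task argument becomes a separate `-T name=value` pair.
    for name, value in (task_args or {}).items():
        cmd += ["-T", f"{name}={value}"]
    return cmd


def report_commands():
    """Return one shell-quoted command line per model in REPORT_RUNS."""
    return [shlex.join(build_eval_command(m, a)) for m, a in REPORT_RUNS.items()]


if __name__ == "__main__":
    # Print the commands instead of running them, so they can be reviewed first.
    for line in report_commands():
        print(line)
```

Printing the commands rather than executing them keeps the sketch free of side effects; the argument lists from `build_eval_command` could be passed to `subprocess.run` from inside the cloned repository if unattended runs are wanted.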