MCPTox: Tool Poisoning Attacks on Real-World MCP Servers
MCPTox measures whether tool-using LLM agents are manipulated by Tool Poisoning Attacks, where a malicious instruction is hidden inside a tool’s description (the metadata an agent reads when planning) rather than in any executed code. Built on 45 live MCP servers and 353 authentic tools, it presents an agent with a benign user query and a server whose tool set contains one poisoned tool, and measures the Attack Success Rate via a model judge.
Overview
⚠️ External evaluation. Code lives in an upstream repository. inspect_evals lists it for discoverability; review the upstream repo and pinned commit before running.
Source: stefanoamorelli/inspect-evals-mcptox@d45705b
MCPTox measures whether tool-using LLM agents are manipulated by Tool Poisoning Attacks, where a malicious instruction is hidden inside a tool’s description (the metadata an agent reads when planning) rather than in any executed code. Built on 45 live MCP servers and 353 authentic tools, it presents an agent with a benign user query and a server whose tool set contains one poisoned tool, and measures the Attack Success Rate via a model judge.
Usage
Installation
This is an externally-maintained evaluation. Clone the upstream repository at the pinned commit and install its dependencies:
git clone https://github.com/stefanoamorelli/inspect-evals-mcptox
cd inspect-evals-mcptox
git checkout d45705b0a7ae6697c851e311187b06bf7488b13f
uv syncRunning evaluations
CLI
uv run inspect eval src/inspect_evals_mcptox/mcptox.py@mcptox --model openai/gpt-5-nanoPython
from inspect_ai import eval
from inspect_evals_mcptox.mcptox import mcptox
eval(mcptox(), model="openai/gpt-5-nano")View logs
uv run inspect viewMore information
For the dataset, scorer, task parameters, and validation, see the upstream repo: stefanoamorelli/inspect-evals-mcptox.
Options
You can control a variety of options from the command line. For example:
uv run inspect eval src/inspect_evals_mcptox/mcptox.py@mcptox --limit 10
uv run inspect eval src/inspect_evals_mcptox/mcptox.py@mcptox --max-connections 10
uv run inspect eval src/inspect_evals_mcptox/mcptox.py@mcptox --temperature 0.5See uv run inspect eval --help for all available options.
More command-line options: Inspect docs ↗