This paper puts real AI agents into a safe, live playground and asks expert testers to mess with them to see what breaks.
This paper checks how safe a real, tool-using AI agent called Clawdbot (OpenClaw) is by watching every step it takes during tasks, not just the final answer.
AgentDoG is a new ‘diagnostic guardrail’ that watches AI agents step by step and explains exactly why a risky action happened.
Agent skills are like apps for AI helpers, but many of them are not carefully checked for safety yet.
ToolSafe is a new way to keep AI agents safe when they use external tools, by checking each action before it runs.
FinVault is a new test that checks if AI helpers for finance stay safe while actually doing real jobs, not just chatting.
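
Several of the summaries above share one idea: check each action an agent proposes before it runs, and record why it was allowed or blocked. The toy sketch below illustrates that idea only. It is not the method of ToolSafe, AgentDoG, or any other paper listed here, and every name in it (`ToolCall`, `Verdict`, `check_call`, `run_with_guard`, the allow-list, and the example tools) is a hypothetical stand-in chosen for illustration.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Dict, List, Set


@dataclass
class ToolCall:
    """One action the agent proposes: a tool name plus its arguments."""
    tool: str
    args: Dict[str, Any] = field(default_factory=dict)


@dataclass
class Verdict:
    """Outcome of a pre-execution check, with a human-readable reason."""
    allowed: bool
    reason: str


def check_call(call: ToolCall, allowed_tools: Set[str]) -> Verdict:
    """Hypothetical policy: permit only allow-listed tools, and reject
    shell commands containing an obviously destructive pattern."""
    if call.tool not in allowed_tools:
        return Verdict(False, f"tool '{call.tool}' is not on the allow-list")
    if call.tool == "shell" and "rm -rf" in str(call.args.get("cmd", "")):
        return Verdict(False, "destructive shell command")
    return Verdict(True, "ok")


def run_with_guard(
    calls: List[ToolCall],
    tools: Dict[str, Callable[..., Any]],
    allowed_tools: Set[str],
) -> List[Dict[str, Any]]:
    """Check every proposed call before it runs and keep a step-level log.

    Assumes every name in allowed_tools is also a key in tools.
    """
    log = []
    for step, call in enumerate(calls):
        verdict = check_call(call, allowed_tools)
        # Blocked calls are never executed; their result stays None.
        result = tools[call.tool](**call.args) if verdict.allowed else None
        log.append({
            "step": step,
            "tool": call.tool,
            "allowed": verdict.allowed,
            "reason": verdict.reason,
            "result": result,
        })
    return log


# Example: three proposed actions, two of which should be blocked.
tools = {
    "echo": lambda text: text,
    "shell": lambda cmd: f"ran: {cmd}",  # stub; nothing is really executed
}
calls = [
    ToolCall("echo", {"text": "hi"}),
    ToolCall("shell", {"cmd": "rm -rf /tmp/x"}),
    ToolCall("send_money", {"amount": 10}),
]
log = run_with_guard(calls, tools, allowed_tools={"echo", "shell"})
for entry in log:
    print(entry["step"], entry["tool"], entry["allowed"], entry["reason"])
```

Keeping the reason next to each decision is what makes the log useful for diagnosis afterwards: it shows not only that a step was blocked but which rule blocked it.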