This paper puts real AI agents into a safe, live playground and asks expert testers to mess with them to see what breaks.
This paper checks how safe a real, tool-using AI agent called Clawdbot (OpenClaw) is by watching every step it takes during tasks, not just the final answer.
Agent skills are like apps for AI helpers, but many of them are not carefully checked for safety yet.