This paper checks the safety of Clawdbot (OpenClaw), a real, tool-using AI agent, by watching every step it takes during a task, not just its final answer.
Multi-agent systems are like teams of expert helpers; the tricky part is choosing which helpers to ask for each question.