Papers3

#prompt injection

Agents of Chaos

Natalie Shapira, Chris Wendler et al.Feb 23arXiv

This paper put real AI agents into a safe, live playground and asked expert testers to mess with them to see what breaks.

#AI agents#red teaming#identity verification

Not triaged yet

A Trajectory-Based Safety Audit of Clawdbot (OpenClaw)

Beginner

Tianyu Chen, Dongrui Liu et al.Feb 16arXiv

This paper checks how safe a real, tool-using AI agent called Clawdbot (OpenClaw) is by watching every step it takes during tasks, not just the final answer.

#trajectory-centric safety#tool-using AI agents#prompt injection

Not triaged yet

Agent Skills in the Wild: An Empirical Study of Security Vulnerabilities at Scale

Beginner

Yi Liu, Weizhe Wang et al.Jan 15arXiv

Agent skills are like apps for AI helpers, but many of them are not carefully checked for safety yet.

#agent skills#AI security#prompt injection

Not triaged yet