How I Study AI - Learn AI Papers & Lectures the Easy Way

Agents of Chaos

Natalie Shapira, Chris Wendler et al. · Feb 23 · arXiv

This paper put real AI agents into a safe, live playground and asked expert testers to mess with them to see what breaks.

#AI agents #red teaming #identity verification

AgentIF-OneDay: A Task-level Instruction-Following Benchmark for General AI Agents in Daily Scenarios

Beginner

Kaiyuan Chen, Qimin Wu et al. · Jan 28 · arXiv

This paper builds a new test called AgentIF-OneDay that checks if AI helpers can follow everyday instructions the way people actually give them.

#AgentIF-OneDay #instruction following #AI agents

The Poisoned Apple Effect: Strategic Manipulation of Mediated Markets via Technology Expansion of AI Agents

Intermediate

Eilam Shapira, Roi Reichart et al. · Jan 16 · arXiv

The paper shows that simply adding a new AI model to the menu—without anyone actually using it—can push a fairness-focused regulator to change the market rules, shifting money from one side to the other.

#Poisoned Apple effect #AI agents #meta-game
