AI helpers often don’t know new users’ tastes and can’t keep up when those tastes change.
The paper introduces Trainee-Bench, a new test for AI agents that is set up like a real first day at work, with tasks arriving over time, hidden clues, and changing priorities.