This paper builds a new test called AgentIF-OneDay that checks whether AI helpers can follow everyday instructions the way people actually give them.
The paper shows that simply adding a new AI model to the menu, even if no one actually uses it, can push a fairness-focused regulator to change the market rules, shifting money from one side to the other.