This paper explains how to turn large language models (LLMs) from quiet students that only answer questions into active agents that can plan, act, and learn over time.
ET-Agent is a training framework that teaches AI agents to use tools (like search and code) more wisely, not just to get the right answer.
Machine learning agents usually improve by writing code, running it for hours, and then using the results to adjust the next attempt, a cycle that is very slow.
AT2PO is a new way to train AI agents that work over several turns, for example by searching the web, reading the result, and trying again.
This paper shows how AI agents can learn new reusable skills and get better over time through reinforcement learning, not just prompts.