This paper teaches long-horizon AI agents to remember everything exactly without stuffing their whole memory into the context at once.
Tool-R0 teaches a language model to use software tools (like APIs) with zero human-made training data.
AI helpers often don’t know new users’ tastes and can’t keep up when those tastes change.
AgenticPay is a safe playground where AI agents practice buying and selling by talking, not just by typing numbers.
RLAnything is a new reinforcement learning (RL) framework that trains three things at once: the policy (the agent), the reward model (the judge), and the environment (the tasks).
LLM agents are usually trained in a few worlds but asked to work in many different, unseen worlds, which often hurts their performance.
AI agents often act very sure of themselves even when they are wrong, especially on long, multi-step tasks.
This paper studies how AI agents that use tools talk about how sure they are and finds a split: some tools make them too sure, others help them be honest.
MemEvolve teaches AI agents not only to remember past experiences but also to improve the way they remember, like a student who upgrades their study habits over time.
SCOPE lets AI agents rewrite their own instructions while they are working, so they can fix mistakes and get smarter on the next step, not just the next task.