Large language models are great at words, but they struggle to predict what will happen after they act in a changing world.
SPARK is a new way to train AI agents that saves compute by concentrating extra exploration on the most important moments.
LLM agents are usually trained in a few worlds but asked to work in many different, unseen worlds, which often hurts their performance.
Agents often act like tourists without a map: they react to what they see now and miss long-term consequences.
This paper asks whether large language models (LLMs) can act as "world models" that predict what happens next in text-based environments, not just the next word in a sentence.
This paper introduces LAMER, a Meta-RL training framework that teaches language agents to explore first and then use what they learned to solve tasks faster.
GTR-Turbo teaches a vision-language agent using a "free teacher" made by merging its own past checkpoints, so no costly external model is needed.