Agents often act like tourists without a map: they react to what they see now and miss long-term consequences.
This paper asks if large language models (LLMs) can act like "world models" that predict what happens next in text-based environments, not just the next word in a sentence.