This paper argues that a true world model is not built by sprinkling facts into single tasks, but by creating a unified system that can see, think, remember, act, and generate across many situations.
DeepPlanning is a new benchmark that tests whether AI can make long, realistic plans that stay within time and budget limits.