The paper introduces CHAIN, a hands-on 3D playground that tests whether AI can not only see objects but also plan and act under real physics.
SAW-Bench is a new test that checks whether AI can understand the world from a first-person view, like wearing smart glasses.
This paper asks a simple question with big consequences: can today’s AI models actively explore a new space and build a trustworthy internal map of it?
This paper argues that true world models are not built by sprinkling facts into single tasks, but by creating a unified system that can see, think, remember, act, and generate across many situations.
Robots need videos that not only look pretty but also follow real-world physics and finish the task asked of them.
TwinBrainVLA is a robot brain with two halves: a frozen generalist that preserves world knowledge and a trainable specialist that learns to move precisely.
WorldMind teaches AI agents to learn the rules of the real world while they act, instead of cramming everything into fixed model weights.
Fast-ThinkAct teaches a robot to plan with a few tiny hidden "thought tokens" instead of long paragraphs, making it much faster while staying smart.
Real-life directions are often vague, so the paper creates a task where a robot can ask questions while it searches for a very specific object in a big house.
Robots learn best from what they would actually see, which is a first-person (egocentric) view, but most AI models are trained on third-person videos and get confused.
This paper shows how to make home-helper robots better at long, multi-step chores by training smartly on diverse tasks, then polishing the model afterward using its own best attempts.