RIVER Bench is a new test that checks how well AI can watch a video stream and talk with you in real time.
MemSifter is a smart helper that picks the right memories for a big AI so the big AI doesn't have to read everything.
MemGUI-Bench is a new test that checks how well phone-controlling AI agents can remember important information both during a task and across different tries.
This paper argues that a true world model does not come from sprinkling facts into single tasks, but from building a unified system that can see, think, remember, act, and generate across many situations.