Helios is a 14-billion-parameter video model that can make minute-long videos in real time at about 19.5 frames per second on a single NVIDIA H100 GPU.
MIND is a new benchmark that fairly tests two core skills of world models: remembering the world over time (memory consistency) and following controls exactly (action control).
The paper fixes a common problem in video world models: scenes slowly change or "drift" when the camera moves and comes back.
Robots used to copy actions from videos without truly understanding how the world changes, so they often messed up long, multi-step jobs.
MemFlow is a new way for AI to remember the right parts of a long video story while it keeps making new parts, so characters and scenes stay consistent.
LongVie 2 is a video world model that can generate controllable videos for 3–5 minutes while keeping the look and motion steady over time.