This paper shows that generating short videos can help AI plan and reason visually, and that this works better than writing out the steps in text.
Video models can now be told what physical result you want (like “make this ball move left with a strong push”) using Goal Force, instead of just vague text or a final picture.