Papers (2)

#visual planning

Thinking in Frames: How Visual Context and Test-Time Scaling Empower Video Reasoning

Chengzu Li, Zanyi Wang et al. · Jan 28 · arXiv

This paper shows that generating short videos can help AI models plan and reason visually better than writing out the steps in text.

#video reasoning #visual planning #test-time scaling


Goal Force: Teaching Video Models To Accomplish Physics-Conditioned Goals

Beginner

Nate Gillman, Yinghua Zhou et al. · Jan 9 · arXiv

With Goal Force, a video model can be told what physical result is wanted (for example, “make this ball move left with a strong push”), rather than being given only a vague text prompt or a final image.

#goal force #force vector control #visual planning
