Papers (3)

#world models

From Perception to Action: An Interactive Benchmark for Vision Reasoning

The paper introduces CHAIN, a hands-on 3D playground that tests whether AI can not only see objects but also plan and act under real physics.

#interactive benchmark #vision-language models #physical reasoning

WorldCompass: Reinforcement Learning for Long-Horizon World Models

Beginner

Zehan Wang, Tengfei Wang et al. · Feb 9 · arXiv

WorldCompass uses reinforcement learning after pretraining to teach video world models to follow actions more closely while keeping the generated frames visually clean.

#world models #reinforcement learning #clip-level rollout

Goal Force: Teaching Video Models To Accomplish Physics-Conditioned Goals

Beginner

Nate Gillman, Yinghua Zhou et al. · Jan 9 · arXiv

With Goal Force, you can tell a video model the physical result you want (like “make this ball move left with a strong push”) instead of relying on vague text or a final picture.

#goal force #force vector control #visual planning
