Papers3

#robot policy learning

Cosmos Policy: Fine-Tuning Video Models for Visuomotor Control and Planning

Moo Jin Kim, Yihuai Gao et al.Jan 22arXiv

Cosmos Policy teaches robots to act by fine-tuning a powerful video model in just one training stage, without changing the model’s architecture.

#video diffusion#robot policy learning#visuomotor control

Not triaged yet

Dream2Flow: Bridging Video Generation and Open-World Manipulation with 3D Object Flow

Intermediate

Karthik Dharmarajan, Wenlong Huang et al.Dec 31arXiv

Dream2Flow lets a robot watch a short, AI-generated video of a task and then do that task in real life by following object motion in 3D.

#3D object flow#video generation for robotics#open-world manipulation

Not triaged yet

SurgWorld: Learning Surgical Robot Policies from Videos via World Modeling

Intermediate

Yufan He, Pengfei Guo et al.Dec 29arXiv

SurgWorld teaches surgical robots using videos plus text, then guesses the missing robot moves so we can train good policies without collecting tons of real robot-action data.

#surgical world model#SATA dataset#inverse dynamics model

Not triaged yet