SurgWorld learns surgical robot skills from videos paired with text, then infers the robot actions missing from that footage, so good policies can be trained without collecting large amounts of real robot-action data.
The paper trains a video generator to produce realistic motion by borrowing motion knowledge from a strong video tracker.