The paper teaches a game-playing AI to copy good human players (behavior cloning) and shows that simply scaling up the model and the data makes the AI reason more causally (it pays attention to what truly causes outcomes on screen).
SurgWorld teaches surgical robots using videos plus text, then guesses the missing robot moves so we can train good policies without collecting tons of real robot-action data.