This paper teaches AI to pick up step-by-step procedures on the spot from demonstrations, the way people do.
Robust-R1 teaches vision-language models to notice how a picture is damaged, think through what that damage hides, and then answer as if the picture were clear.