This paper teaches talking avatars not just to speak, but to look around their scene and handle nearby objects exactly as a text instruction says.
This paper shows how to make a whole picture in one go, directly in pixels, without using a hidden "latent" space or many tiny steps.
ThinkRL-Edit teaches an image editor to think first and draw second, which makes tricky, reasoning-heavy edits much more accurate.