Robots often get confused on long, multi-step tasks when they only see the final goal image and try to guess the next move directly.
SpotEdit is a training-free way to edit only the parts of an image that actually change, instead of regenerating the whole picture.
DreaMontage is a new AI method that makes long, single-shot videos that feel smooth and connected, even when the guidance you give it is a scattering of images or short clips placed partway through the video.
This paper presents a way to protect your photos from misuse by new AI image editors that can copy your face or style from just one picture.
MetaCanvas lets a multimodal large language model (MLLM) sketch a plan inside the generator’s hidden canvas so diffusion models can follow it patch by patch.
This paper introduces BiCo, a one-shot way to mix ideas from images and videos by tightly tying each visual idea to the exact words in a prompt.
OmniPSD is a new AI that can both make layered Photoshop (PSD) files from words and take apart a flat image into clean, editable layers.
OpenSubject is a giant video-based dataset (2.5M samples, 4.35M images) built to help AI make pictures that keep each person or object looking like themselves, even in busy scenes.