How I Study AI - Learn AI Papers & Lectures the Easy Way

Papers (16)

#Diffusion Transformer

Exploring MLLM-Diffusion Information Transfer with MetaCanvas

Intermediate
Han Lin, Xichen Pan et al., Dec 12, arXiv

MetaCanvas lets a multimodal language model (MLLM) sketch a plan inside the generator’s hidden canvas so diffusion models can follow it patch by patch.

#MetaCanvas #MLLM #Diffusion Transformer

Composing Concepts from Images and Videos via Concept-prompt Binding

Intermediate
Xianghao Kong, Zeyu Zhang et al., Dec 10, arXiv

This paper introduces BiCo, a one-shot method for composing visual concepts from images and videos by binding each concept to the specific prompt tokens that describe it.

#BiCo #concept binding #token-level composition

OmniPSD: Layered PSD Generation with Diffusion Transformer

Intermediate
Cheng Liu, Yiren Song et al., Dec 10, arXiv

OmniPSD is a diffusion-transformer model that can both generate layered Photoshop (PSD) files from text prompts and decompose a flat image into clean, editable layers.

#OmniPSD #layered PSD generation #RGBA-VAE

OpenSubject: Leveraging Video-Derived Identity and Diversity Priors for Subject-driven Image Generation and Manipulation

Intermediate
Yexin Liu, Manyuan Zhang et al., Dec 9, arXiv

OpenSubject is a large video-derived dataset (2.5M samples, 4.35M images) built to help models generate and edit images that preserve each subject's identity, whether a person or an object, even in busy scenes.

#subject-driven generation #identity fidelity #video-derived dataset