🎓How I Study AIHISA
📖Read
📄Papers📰Blogs🎬Courses
💡Learn
🛤️Paths📚Topics💡Concepts🎴Shorts
🎯Practice
📝Daily Log🎯Prompts🧠Review
SearchSettings
How I Study AI - Learn AI Papers & Lectures the Easy Way

Papers17

AllBeginnerIntermediateAdvanced
All SourcesarXiv
#GenEval

Few-Step Distillation for Text-to-Image Generation: A Practical Guide

Intermediate
Yifan Pu, Yizeng Han et al.Dec 15arXiv

Big text-to-image models make amazing pictures but are slow because they take hundreds of tiny steps to turn noise into an image.

#text-to-image#diffusion models#few-step generation

Exploring MLLM-Diffusion Information Transfer with MetaCanvas

Intermediate
Han Lin, Xichen Pan et al.Dec 12arXiv

MetaCanvas lets a multimodal language model (MLLM) sketch a plan inside the generator’s hidden canvas so diffusion models can follow it patch by patch.

#MetaCanvas#MLLM#Diffusion Transformer

EMMA: Efficient Multimodal Understanding, Generation, and Editing with a Unified Architecture

Intermediate
Xin He, Longhui Wei et al.Dec 4arXiv

EMMA is a single AI model that can understand images, write about them, create new images from text, and edit images—all in one unified system.

#EMMA#unified multimodal architecture#32x autoencoder

TwinFlow: Realizing One-step Generation on Large Models with Self-adversarial Flows

Intermediate
Zhenglin Cheng, Peng Sun et al.Dec 3arXiv

TwinFlow is a new way to make big image models draw great pictures in just one step instead of 40–100 steps.

#TwinFlow#one-step generation#twin trajectories

Visual Generation Tuning

Intermediate
Jiahao Guo, Sinan Du et al.Nov 28arXiv

Before this work, big vision-language models (VLMs) were great at understanding pictures and words together but not at making new pictures.

#Visual Generation Tuning#VGT-AE#Vision-Language Models
12