How I Study AI - Learn AI Papers & Lectures the Easy Way

Papers (5)

#Classifier-Free Guidance

Semantic Routing: Exploring Multi-Layer LLM Feature Weighting for Diffusion Transformers

Intermediate
Bozhou Li, Yushuo Guan et al., Feb 3, arXiv

The paper shows that conditioning on features from many layers of a language model, rather than a single layer, helps text-to-image diffusion transformers follow prompts more faithfully.

#Diffusion Transformer #Text Conditioning #Multi-layer LLM Features

Balancing Understanding and Generation in Discrete Diffusion Models

Intermediate
Yue Liu, Yuzhong Zhao et al., Feb 1, arXiv

This paper introduces XDLM, a single model that blends two popular diffusion styles (masked and uniform) so it both understands and generates text and images well.

#XDLM #discrete diffusion #stationary noise kernel

StageVAR: Stage-Aware Acceleration for Visual Autoregressive Models

Intermediate
Senmao Li, Kai Wang et al., Dec 18, arXiv

StageVAR makes image-generating AI much faster by recognizing that early steps set the meaning and structure, while later steps just polish details.

#Visual Autoregressive Modeling #Next-Scale Prediction #Stage-Aware Acceleration

RecTok: Reconstruction Distillation along Rectified Flow

Intermediate
Qingyu Shi, Size Wu et al., Dec 15, arXiv

RecTok is a new visual tokenizer that makes the whole forward flow of a diffusion model (its path from data to noise) carry image meaning, not just the starting latent features.

#Rectified Flow #Flow Matching #Visual Tokenizer

Bidirectional Normalizing Flow: From Data to Noise and Back

Intermediate
Yiyang Lu, Qiao Sun et al., Dec 11, arXiv

Normalizing Flows are models that learn how to turn real images into simple noise and then back again.

#Normalizing Flow #Bidirectional Normalizing Flow #Hidden Alignment