Papers (6)

#Classifier-Free Guidance

CFG-Ctrl: Control-Based Classifier-Free Diffusion Guidance

Hanyang Wang, Yiyang Liu et al. · Mar 3 · arXiv

This paper recasts a popular image-guidance technique (Classifier-Free Guidance) as a feedback-control problem, much like keeping a car steady in its lane.

#Classifier-Free Guidance #Sliding Mode Control #Diffusion Models

Not triaged yet

Semantic Routing: Exploring Multi-Layer LLM Feature Weighting for Diffusion Transformers

Intermediate

Bozhou Li, Yushuo Guan et al. · Feb 3 · arXiv

The paper shows that using information from many layers of a language model, rather than a single layer, helps text-to-image diffusion transformers follow prompts more closely.

#Diffusion Transformer #Text Conditioning #Multi-layer LLM Features

Not triaged yet

Balancing Understanding and Generation in Discrete Diffusion Models

Intermediate

Yue Liu, Yuzhong Zhao et al. · Feb 1 · arXiv

This paper introduces XDLM, a single model that blends two popular diffusion styles (masked and uniform) so it both understands and generates text and images well.

#XDLM #discrete diffusion #stationary noise kernel

Not triaged yet

StageVAR: Stage-Aware Acceleration for Visual Autoregressive Models

Intermediate

Senmao Li, Kai Wang et al. · Dec 18 · arXiv

StageVAR speeds up visual autoregressive image generation by recognizing that early steps set the meaning and structure, while later steps just polish details.

#Visual Autoregressive Modeling #Next-Scale Prediction #Stage-Aware Acceleration

Not triaged yet

RecTok: Reconstruction Distillation along Rectified Flow

Intermediate

Qingyu Shi, Size Wu et al. · Dec 15 · arXiv

RecTok is a new visual tokenizer that teaches the whole training path of a diffusion model (the forward flow) to be smart about image meaning, not just the starting latent features.

#Rectified Flow #Flow Matching #Visual Tokenizer

Not triaged yet

Bidirectional Normalizing Flow: From Data to Noise and Back

Intermediate

Yiyang Lu, Qiao Sun et al. · Dec 11 · arXiv

Normalizing Flows are models that learn how to turn real images into simple noise and then back again.

#Normalizing Flow #Bidirectional Normalizing Flow #Hidden Alignment

Not triaged yet