Papers (6)

#Text-to-Image

CFG-Ctrl: Control-Based Classifier-Free Diffusion Guidance

Hanyang Wang, Yiyang Liu et al. · Mar 3 · arXiv

This paper turns a popular image-guidance trick (Classifier-Free Guidance) into a feedback-control problem, just like keeping a car steady in its lane.

#Classifier-Free Guidance #Sliding Mode Control #Diffusion Models


DDiT: Dynamic Patch Scheduling for Efficient Diffusion Transformers

Intermediate

Dahye Kim, Deepti Ghadiyaram et al. · Feb 19 · arXiv

This paper speeds up image and video generators called diffusion transformers by changing how big their puzzle pieces (patches) are at each step.

#Diffusion Transformer #Dynamic Tokenization #Patch Scheduling


HY3D-Bench: Generation of 3D Assets

Intermediate

Team Hunyuan3D et al. · Feb 3 · arXiv

HY3D-Bench is a complete, open-source “starter kit” for making and studying high-quality 3D objects.

#HY3D-Bench #watertight meshes #part-level decomposition


Scaling Text-to-Image Diffusion Transformers with Representation Autoencoders

Intermediate

Shengbang Tong, Boyang Zheng et al. · Jan 22 · arXiv

Before this work, most text-to-image models used VAEs (small, squished image codes) and struggled with slow training and overfitting on high-quality fine-tuning sets.

#Representation Autoencoder #RAE #Variational Autoencoder


SnapGen++: Unleashing Diffusion Transformers for Efficient High-Fidelity Image Generation on Edge Devices

Intermediate

Dongting Hu, Aarush Gupta et al. · Jan 13 · arXiv

This paper shows how to make powerful image‑generating Transformers run fast on phones without needing the cloud.

#Diffusion Transformer #Sparse Attention #Adaptive Sparse Self-Attention


Both Semantics and Reconstruction Matter: Making Representation Encoders Ready for Text-to-Image Generation and Editing

Beginner

Shilong Zhang, He Zhang et al. · Dec 19 · arXiv

This paper shows that great image understanding features alone are not enough for making great images; you also need strong pixel-level detail.

#Pixel–Semantic VAE #Semantic Regularization #Off-Manifold Generation
