Papers (2)

#Classifier-free guidance

UniT: Unified Multimodal Chain-of-Thought Test-time Scaling

Leon Liangyu Chen, Haoyu Ma et al. · Feb 12 · arXiv

UniT teaches one multimodal model to think in steps with pictures and words, so it can check its own work and fix mistakes as it goes.

#Unified multimodal model #Chain-of-thought #Test-time scaling


Self-Evaluation Unlocks Any-Step Text-to-Image Generation

Intermediate

Xin Yu, Xiaojuan Qi et al. · Dec 26 · arXiv

This paper introduces Self-E, a text-to-image model that learns from scratch and can generate good pictures in any number of steps, from just a few to many.

#Self-Evaluating Model #Any-step inference #Text-to-image generation
