Papers (2)

#Autoregressive Generation

Render-of-Thought: Rendering Textual Chain-of-Thought as Images for Visual Latent Reasoning

Render-of-Thought (RoT) renders the model's step-by-step textual reasoning as compact images, so the model can reason faster with fewer tokens.

#Render-of-Thought #Chain-of-Thought #Latent Reasoning


VQRAE: Representation Quantization Autoencoders for Multimodal Understanding, Generation and Reconstruction

Intermediate

Sinan Du, Jiahao Guo et al. · Nov 28 · arXiv

VQRAE is a new kind of image tokenizer that lets one model both understand images (continuous features) and generate/reconstruct them (discrete tokens).

#VQRAE #Vector Quantization #Representation Autoencoder
