Papers3

All Beginner Intermediate Advanced

All Sources arXiv

#latent tokens

Imagination Helps Visual Reasoning, But Not Yet in Latent Space

Beginner

You Li, Chi Chen et al.Feb 26arXiv

The paper asks a simple question: do the model’s invisible “imagination tokens” actually help it reason about images?

#multimodal large language model#visual reasoning#latent visual reasoning

Not triaged yet

Latent Thoughts Tuning: Bridging Context and Reasoning with Fused Information in Latent Tokens

Intermediate

Weihao Liu, Dehai Min et al.Feb 10arXiv

The paper introduces LT-Tuning, a way for AI models to “think silently” using special hidden tokens instead of writing every step out loud.

#latent tokens#chain-of-thought#context-prediction fusion

Not triaged yet

Latent Implicit Visual Reasoning

Intermediate

Kelvin Li, Chuyi Shang et al.Dec 24arXiv

Large Multimodal Models (LMMs) are great at reading text and looking at pictures, but they usually do most of their thinking in words, which limits deep visual reasoning.

#Latent Implicit Visual Reasoning#latent tokens#bottleneck attention masking

Not triaged yet