Imagination Helps Visual Reasoning, But Not Yet in Latent Space
Beginner · You Li, Chi Chen et al. · Feb 26 · arXiv
The paper asks a simple question: do the model’s invisible “imagination tokens” actually help it reason about images?
#multimodal large language model #visual reasoning #latent visual reasoning