The paper asks a simple question: do the model’s invisible “imagination tokens” actually help it reason about images?
Searching through videos, images, and long documents is powerful but gets very expensive when every tiny piece is stored separately.
LatentLens is a simple, training-free way to translate what a model “sees” in image patches into clear words and phrases.
IVRA is a simple, training-free add-on that helps robot brains keep the 2D shape of images while following language instructions.