How I Study AI - Learn AI Papers & Lectures the Easy Way

Papers (3)


Beyond Unimodal Shortcuts: MLLMs as Cross-Modal Reasoners for Grounded Named Entity Recognition

Intermediate
Jinlong Ma, Yu Zhang et al. · Feb 4 · arXiv

The paper trains multimodal large language models (MLLMs) to stop guessing from text alone or images alone, and instead to cross-check both modalities before answering.

#GMNER · #Multimodal Large Language Models · #Modality Bias

ReGuLaR: Variational Latent Reasoning Guided by Rendered Chain-of-Thought

Intermediate
Fanmeng Wang, Haotian Liu et al. · Jan 30 · arXiv

Chain-of-Thought (CoT) prompting makes a model reason step by step, but it is slow because every reasoning step is written out token by token.

#Chain-of-Thought · #Latent Reasoning · #Variational Auto-Encoder

Render-of-Thought: Rendering Textual Chain-of-Thought as Images for Visual Latent Reasoning

Intermediate
Yifan Wang, Shiyu Li et al. · Jan 21 · arXiv

Render-of-Thought (RoT) converts the model's step-by-step reasoning from long text into compact images, letting the model think faster with fewer tokens.

#Render-of-Thought · #Chain-of-Thought · #Latent Reasoning