This paper introduces Laser, a new way for vision-language models to think in their hidden space before speaking, so they see the whole “forest” before picking out the “trees.”
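To picture what "thinking in hidden space" could look like, here is a tiny Python sketch of latent reasoning in general: a hidden state gets refined a few times before any token is emitted. The weights, sizes, and step count are made up for illustration and are not Laser's actual design.

```python
import numpy as np

rng = np.random.default_rng(0)
D = 16                                            # toy hidden width
W_think = rng.normal(size=(D, D)) / np.sqrt(D)    # hypothetical "reasoning" weights
W_out = rng.normal(size=(D, 4)) / np.sqrt(D)      # maps hidden state to 4 toy tokens

def answer(h, num_latent_steps=3):
    # Refine the hidden state several times *before* emitting any token,
    # i.e. "think in hidden space before speaking".
    for _ in range(num_latent_steps):
        h = np.tanh(W_think @ h)
    logits = W_out.T @ h
    return int(np.argmax(logits))                 # the first spoken token

h0 = rng.normal(size=D)                           # stand-in for the encoded image + prompt
print(answer(h0, num_latent_steps=0), answer(h0, num_latent_steps=3))
```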
This paper teaches AI models not just how to solve problems but also how to tell when their own answers might be wrong.
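One generic way a model can "tell when it might be wrong" is to answer the same question several times and see whether the answers agree. The toy sketch below shows that agreement-as-confidence idea; it is not necessarily the paper's actual training method.

```python
from collections import Counter
import random

random.seed(0)

def solve(problem):
    # Stand-in for a stochastic model call that returns one candidate answer.
    return random.choice(problem["candidates"])

def solve_with_confidence(problem, n_samples=10):
    votes = Counter(solve(problem) for _ in range(n_samples))
    best, count = votes.most_common(1)[0]
    return best, count / n_samples    # low agreement -> treat the answer as suspect

toy = {"candidates": [42, 42, 42, 41]}   # the model usually, but not always, says 42
print(solve_with_confidence(toy))
```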
DiffCoT treats a model’s step-by-step thinking (Chain-of-Thought) like a messy draft that can be cleaned up over time, not something fixed forever.
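Here is the "messy draft" idea as a minimal Python loop: the chain of thought is revised over several rounds instead of being written once and frozen. The `revise` function is a placeholder; DiffCoT's real refinement model is not reproduced here.

```python
def revise(draft):
    # Stand-in for one refinement pass: repair steps flagged as bad.
    return [s.replace("(bad)", "(fixed)") for s in draft]

def refine_cot(draft, num_rounds=3):
    # The chain of thought is an editable draft, not a frozen output:
    # every round may rewrite earlier steps in light of later ones.
    for _ in range(num_rounds):
        draft = revise(draft)
    return draft

draft = ["set x = 2", "x + x = 5 (bad)", "so the answer is 5 (bad)"]
print(refine_cot(draft))
```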
Large language models (LLMs) are good at many math problems but often mess up simple counting when the list gets long.
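To make "simple counting" concrete, this toy harness builds such a task with a known answer. The harness is our illustration; only the observation that accuracy drops on longer lists comes from the paper.

```python
import random

random.seed(0)

def make_counting_task(n_items):
    words = [random.choice(["apple", "pear", "plum"]) for _ in range(n_items)]
    prompt = f"How many times does 'apple' appear in: {', '.join(words)}?"
    return prompt, words.count("apple")   # the exact answer a model should match

for n in (5, 50):   # models tend to slip as the list grows
    prompt, truth = make_counting_task(n)
    print(truth, prompt[:60], "...")
```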
NextFlow is a single, decoder-only Transformer that can read and write both text and images in one continuous sequence.
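A very simplified view of "one continuous sequence": text tokens and image tokens share a single stream, each tagged with its modality, so one decoder can read and write both. NextFlow's real tokenizers and generation details are not shown here.

```python
# Toy token stream: text and image patches share one sequence, each item
# tagged with its modality so a single decoder can read and write both.
TEXT, IMAGE = 0, 1

def build_sequence(text_tokens, image_tokens):
    seq = [(TEXT, t) for t in text_tokens]
    seq += [(IMAGE, t) for t in image_tokens]   # e.g. codes for image patches
    return seq

seq = build_sequence(text_tokens=[101, 7592], image_tokens=[3, 99, 4])
for modality, tok in seq:
    print("text" if modality == TEXT else "image", tok)
```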
This paper adds a tiny but powerful step called Early Knowledge Alignment (EKA) to multi-step retrieval systems so the model takes a quick, smart look at relevant information before it starts planning.
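The sketch below shows the general shape of "look before you plan": a quick retrieval pass runs before the planner ever starts. The keyword scorer and the planner are placeholders, not EKA's actual alignment step.

```python
def retrieve(query, corpus, k=2):
    # Crude keyword-overlap scorer standing in for a real retriever.
    score = lambda doc: len(set(query.split()) & set(doc.split()))
    return sorted(corpus, key=score, reverse=True)[:k]

def plan(query, context):
    # Stand-in for the LLM planner; here it just acknowledges its inputs.
    return f"plan for {query!r} using {len(context)} early-retrieved docs"

def answer(query, corpus):
    context = retrieve(query, corpus)   # the quick, early look at knowledge,
    return plan(query, context)         # taken *before* planning starts

corpus = ["mars has two moons", "the moon orbits earth", "paris is in france"]
print(answer("how many moons does mars have", corpus))
```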
This paper turns messy chains of thought from language models into clear, named steps so we can see how they really think through math problems.
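Turning a messy chain of thought into named steps can be as simple as tagging each step with a label. The mini-taxonomy and keyword rules below are hypothetical; the paper's real label set and labeling method will differ.

```python
import re

# Hypothetical mini-taxonomy; the paper's real label set will differ.
RULES = [
    (re.compile(r"let|define|set"), "setup"),
    (re.compile(r"\d+\s*[-+*/=]"), "calculation"),
    (re.compile(r"therefore|so the answer"), "conclusion"),
]

def label_steps(cot_steps):
    labeled = []
    for step in cot_steps:
        tag = next((name for pat, name in RULES if pat.search(step.lower())), "other")
        labeled.append((tag, step))
    return labeled

cot = ["Let x be the price.", "Then 3 * x = 12.", "Therefore x = 4."]
print(label_steps(cot))
```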
The paper proposes the Laws of Reasoning (LORE), simple rules that say how much a model should think and how accurate it can be as problems get harder.
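To show what such a law could look like, here is a generic saturating curve where more thinking helps and harder problems hurt. The logistic form and the constants `a` and `b` are assumptions for illustration, not the paper's fitted LORE equations.

```python
import math

def accuracy(difficulty, think_tokens, a=1.0, b=0.5):
    # Hypothetical saturating law: accuracy rises with thinking budget and
    # falls with difficulty. Shape only -- NOT the paper's fitted equation.
    return 1.0 / (1.0 + math.exp(a * difficulty - b * math.log(1 + think_tokens)))

for d in (1, 5):
    for t in (10, 1000):
        print(f"difficulty={d} tokens={t} acc={accuracy(d, t):.2f}")
```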
Traditional self-driving used separate boxes for seeing, thinking, and acting, but tiny mistakes in early boxes could snowball into big problems later.
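The snowballing is easy to see with toy numbers: if each module hands its output onward with a small relative error, the errors multiply down the pipeline. The 5% figure and stage names below are illustrative only.

```python
def stage(x, error=0.05):
    # Each hand-crafted module passes its output (plus a small error) onward.
    return x * (1 + error)

estimate = 1.0   # true quantity normalized to 1.0
for name in ("perception", "prediction", "planning", "control"):
    estimate = stage(estimate)
    print(f"after {name}: relative error = {estimate - 1.0:.3f}")
# Small per-stage errors compound across the pipeline instead of canceling.
```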
Vision-Language-Action (VLA) models are robots’ “see–think–do” brains that connect cameras (vision), words (language), and motors (action).
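Here is the "see–think–do" loop as a bare-bones function: image and instruction go in, an action vector comes out. The encoders and random weights are toy stand-ins, not any real VLA architecture.

```python
import numpy as np

rng = np.random.default_rng(0)

def encode_vision(image):                  # "see": a toy image feature
    return image.mean(axis=(0, 1))

def encode_language(instruction):          # "think": a toy text feature
    return np.array([len(instruction), instruction.count(" ")], dtype=float)

def act(image, instruction, W=rng.normal(size=(5, 7))):
    feats = np.concatenate([encode_vision(image), encode_language(instruction)])
    return W @ feats                       # "do": e.g. five joint velocities

image = rng.random((8, 8, 5))              # an 8x8 toy image with 5 channels
print(act(image, "pick up the red block"))
```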
Big AI models often write very long step-by-step solutions, but typical checkers either grade only the final answer or get lost among the many steps.
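The difference between the two kinds of checkers is easiest to see side by side: an outcome checker looks only at the last line, while a process checker scores every step. The toy arithmetic checker below is our illustration, not the paper's verifier.

```python
def step_ok(step):
    # Toy arithmetic checker for steps like "2 + 2 = 4"; lines without
    # an "=" are skipped. (eval is acceptable on this toy data only.)
    if "=" not in step:
        return True
    lhs, rhs = step.split("=")
    return eval(lhs) == int(rhs)

def outcome_check(solution, expected):
    # Looks only at the final line, so it misses bad intermediate steps.
    return solution[-1] == expected

def process_check(solution):
    # Scores every step; one bad step flags the whole chain.
    return all(step_ok(s) for s in solution)

solution = ["2 + 2 = 5", "5 - 1 = 4", "the answer is 4"]   # bad step, right ending
print(outcome_check(solution, "the answer is 4"))   # True: fooled by the final line
print(process_check(solution))                      # False: catches the bad step
```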
This paper builds a math problem–solving agent, Intern-S1-MO, that thinks in multiple rounds and remembers proven mini-results called lemmas so it can solve very long, Olympiad-level problems.
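The lemma-memory idea in miniature: proven mini-results go into a store that persists across solving rounds, so later rounds can build on them. The hard-coded dependency table stands in for the actual solver, and the names are made up.

```python
def try_prove(goal, lemmas):
    # Stand-in for one solver round: a goal is provable once its
    # prerequisite lemma is already in the store.
    needs = {"main theorem": "lemma B", "lemma B": "lemma A", "lemma A": None}
    prereq = needs[goal]
    return prereq is None or prereq in lemmas

def solve(goal, subgoals, max_rounds=5):
    lemmas = set()                         # proven mini-results persist here
    for round_ in range(max_rounds):
        for g in subgoals:
            if g not in lemmas and try_prove(g, lemmas):
                lemmas.add(g)              # remember it for later rounds
        if goal in lemmas:
            return f"solved in round {round_ + 1} with lemmas {sorted(lemmas)}"
    return "unsolved"

print(solve("main theorem", ["main theorem", "lemma B", "lemma A"]))
```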