How I Study AI - Learn AI Papers & Lectures the Easy Way

Papers (38)

#chain-of-thought

Large Reasoning Models Are (Not Yet) Multilingual Latent Reasoners

Beginner
Yihong Liu, Raoyuan Zhao et al. · Jan 6 · arXiv

Large reasoning models can often find the right math answer in their “head” before finishing their written steps, but this works best in languages with lots of training data like English and Chinese.

#latent reasoning #chain-of-thought #multilingual LLMs

CogFlow: Bridging Perception and Reasoning through Knowledge Internalization for Visual Mathematical Problem Solving

Intermediate
Shuhang Chen, Yunqiu Xu et al. · Jan 5 · arXiv

This paper teaches AI to solve diagram-based math problems by copying how people think: first see (perception), then make sense of what you saw (internalization), and finally reason (solve the problem).

#visual mathematical reasoning #multimodal large language models #perception-reasoning alignment

Fantastic Reasoning Behaviors and Where to Find Them: Unsupervised Discovery of the Reasoning Process

Beginner
Zhenyu Zhang, Shujian Zhang et al. · Dec 30 · arXiv

This paper shows a new way (called RISE) to find and control how AI models think without needing any human-made labels.

#RISE #sparse auto-encoder #reasoning vectors

DataFlow: An LLM-Driven Framework for Unified Data Preparation and Workflow Automation in the Era of Data-Centric AI

Intermediate
Hao Liang, Xiaochen Ma et al. · Dec 18 · arXiv

DataFlow is a building-block system that helps large language models get better data by unifying how we create, clean, check, and organize that data.

#DataFlow #LLM data preparation #operator pipeline

N3D-VLM: Native 3D Grounding Enables Accurate Spatial Reasoning in Vision-Language Models

Intermediate
Yuxin Wang, Lei Ke et al. · Dec 18 · arXiv

This paper teaches a vision-language model to first find objects in real 3D space (not just 2D pictures) and then reason about where things are.

#3D grounding #vision-language models #spatial reasoning

Skyra: AI-Generated Video Detection via Grounded Artifact Reasoning

Intermediate
Yifei Li, Wenzhao Zheng et al. · Dec 17 · arXiv

Skyra is a detective-style AI that spots tiny visual mistakes (artifacts) in videos to tell if they are real or AI-generated, and it explains its decision with times and places in the video.

#AI-generated video detection #artifact reasoning #multimodal large language model

Nemotron-Math: Efficient Long-Context Distillation of Mathematical Reasoning from Multi-Mode Supervision

Intermediate
Wei Du, Shubham Toshniwal et al. · Dec 17 · arXiv

Nemotron-Math is a giant math dataset with 7.5 million step-by-step solutions created in three thinking styles and with or without Python help.

#mathematical reasoning #long-context fine-tuning #multi-mode supervision

OpenDataArena: A Fair and Open Arena for Benchmarking Post-Training Dataset Value

Intermediate
Mengzhang Cai, Xin Gao et al. · Dec 16 · arXiv

OpenDataArena (ODA) is a fair, open platform that measures how valuable different post-training datasets are for large language models by holding everything else constant.

#OpenDataArena #post-training datasets #data-centric AI

State over Tokens: Characterizing the Role of Reasoning Tokens

Intermediate
Mosh Levy, Zohar Elyoseph et al. · Dec 14 · arXiv

Reasoning tokens (the words a model writes before its final answer) help the model think better, but they are not a trustworthy diary of how it really thought.

#State over Tokens #reasoning tokens #chain-of-thought

DentalGPT: Incentivizing Multimodal Complex Reasoning in Dentistry

Intermediate
Zhenyang Cai, Jiaming Zhang et al. · Dec 12 · arXiv

DentalGPT is a specialized AI that looks at dental images and text together and explains what it sees, much as a junior dentist would.

#DentalGPT #multimodal large language model #dentistry AI

Rethinking Chain-of-Thought Reasoning for Videos

Intermediate
Yiwu Zhong, Zi-Yuan Hu et al. · Dec 10 · arXiv

The paper shows that video AIs do not need long, human-like chains of thought to reason well.

#video reasoning #chain-of-thought #concise reasoning

VG-Refiner: Towards Tool-Refined Referring Grounded Reasoning via Agentic Reinforcement Learning

Intermediate
Yuji Wang, Wenlong Liu et al. · Dec 6 · arXiv

VG-Refiner is a new way for AI to find the right object in a picture when given a description, even if helper tools make mistakes.

#visual grounding #referring expression comprehension #tool-integrated visual reasoning