🎓How I Study AIHISA
📖Read
📄Papers📰Blogs🎬Courses
💡Learn
🛤️Paths📚Topics💡Concepts🎴Shorts
🎯Practice
🧩Problems🎯Prompts🧠Review
Search
How I Study AI - Learn AI Papers & Lectures the Easy Way

Papers14

AllBeginnerIntermediateAdvanced
All SourcesarXiv
#tool use

Towards Reducible Uncertainty Modeling for Reliable Large Language Model Agents

Intermediate
Changdae Oh, Seongheon Park et al.Feb 4arXiv

This paper says we should measure an AI agent’s uncertainty across its whole conversation, not just on one final answer.

#uncertainty quantification#LLM agents#interactive AI

Kimi K2.5: Visual Agentic Intelligence

Beginner
Kimi Team, Tongtong Bai et al.Feb 2arXiv

Kimi K2.5 is a new open-source AI that can read both text and visuals (images and videos) and act like a team of helpers to finish big tasks faster.

#multimodal learning#vision-language models#joint optimization

D-CORE: Incentivizing Task Decomposition in Large Reasoning Models for Complex Tool Use

Intermediate
Bowen Xu, Shaoyu Wu et al.Feb 2arXiv

This paper fixes a common problem in reasoning AIs called Lazy Reasoning, where the model rambles instead of making a good plan.

#task decomposition#tool use#large reasoning models

daVinci-Agency: Unlocking Long-Horizon Agency Data-Efficiently

Intermediate
Mohan Jiang, Dayuan Fu et al.Feb 2arXiv

Long tasks trip up most AIs because they lose track of goals and make small mistakes that snowball over many steps.

#long-horizon agency#pull request chains#software evolution

CAR-bench: Evaluating the Consistency and Limit-Awareness of LLM Agents under Real-World Uncertainty

Intermediate
Johannes Kirmayr, Lukas Stappen et al.Jan 29arXiv

CAR-bench is a new 'driving test' for AI assistants that checks if they can stay careful, honest, and consistent during real back-and-forth conversations in a car.

#LLM agents#benchmarking#consistency

Robust Tool Use via Fission-GRPO: Learning to Recover from Execution Errors

Beginner
Zhiwei Zhang, Fei Zhao et al.Jan 22arXiv

Small AI models often stumble when a tool call fails and then get stuck repeating bad calls instead of fixing the mistake.

#FISSION-GRPO#error recovery#tool use

User-Oriented Multi-Turn Dialogue Generation with Tool Use at scale

Intermediate
Jungho Cho, Minbyul Jeong et al.Jan 13arXiv

The paper builds a new way to create realistic, long conversations between people and AI that use tools like databases.

#multi-turn dialogue generation#tool use#user simulation

ArenaRL: Scaling RL for Open-Ended Agents via Tournament-based Relative Ranking

Beginner
Qiang Zhang, Boli Chen et al.Jan 10arXiv

ArenaRL teaches AI agents by comparing their answers against each other, like a sports tournament, instead of giving each answer a single noisy score.

#ArenaRL#reinforcement learning#relative ranking

Thinking with Map: Reinforced Parallel Map-Augmented Agent for Geolocalization

Beginner
Yuxiang Ji, Yong Wang et al.Jan 8arXiv

The paper teaches an AI to act like a careful traveler: it looks at a photo, forms guesses about where it might be, and uses real map tools to check each guess.

#image geolocalization#map-augmented agent#Thinking with Map

Youtu-LLM: Unlocking the Native Agentic Potential for Lightweight Large Language Models

Intermediate
Junru Lu, Jiarui Qin et al.Dec 31arXiv

Youtu-LLM is a small (1.96B) language model that was trained from scratch to think, plan, and act like an agent instead of just copying bigger models.

#lightweight LLM#agentic mid-training#trajectory data

SenseNova-MARS: Empowering Multimodal Agentic Reasoning and Search via Reinforcement Learning

Intermediate
Yong Xien Chng, Tao Hu et al.Dec 30arXiv

SenseNova-MARS is a vision-language model that can think step-by-step and use three tools—text search, image search, and image cropping—during its reasoning.

#multimodal agent#vision-language model#reinforcement learning

GenEnv: Difficulty-Aligned Co-Evolution Between LLM Agents and Environment Simulators

Intermediate
Jiacheng Guo, Ling Yang et al.Dec 22arXiv

GenEnv is a training system where a student AI and a teacher simulator grow together by exchanging tasks and feedback.

#GenEnv#co-evolutionary learning#difficulty-aligned curriculum
12