🎓How I Study AIHISA

📖Read

📄Papers 📰Blogs 🎬Courses

💡Learn

🛤️Paths 📚Topics 💡Concepts 🎴Shorts

🎯Practice

📝Daily Log 🎯Prompts 🧠Review

Search Settings

How I Study AI - Learn AI Papers & Lectures the Easy Way

Papers5

All Beginner Intermediate Advanced

All Sources arXiv

#tool-integrated reasoning

Tool-R0: Self-Evolving LLM Agents for Tool-Learning from Zero Data

Emre Can Acikgoz, Cheng Qian et al.Feb 24arXiv

Tool-R0 teaches a language model to use software tools (like APIs) with zero human-made training data.

#self-play reinforcement learning#tool calling#function calling

Not triaged yet

Pushing the Boundaries of Natural Reasoning: Interleaved Bonus from Formal-Logic Verification

Chuxue Cao, Jinluan Yang et al.Jan 30arXiv

Large language models sometimes reach the right answer for the wrong reasons, which is risky and confusing.

#formal logic verification#interleaved verification#neuro-symbolic reasoning

Not triaged yet

BAPO: Boundary-Aware Policy Optimization for Reliable Agentic Search

Shiyu Liu, Yongjing Yin et al.Jan 16arXiv

RL-trained search agents often sound confident even when they don’t know, which can mislead people.

#agentic search#reinforcement learning#boundary awareness

Not triaged yet

Nested Browser-Use Learning for Agentic Information Seeking

Baixuan Li, Jialong Wu et al.Dec 29arXiv

This paper teaches AI helpers to browse the web more like people do, not just by grabbing static snippets.

#information-seeking agents#browser-use#ReAct function-calling

Not triaged yet

Nemotron-Math: Efficient Long-Context Distillation of Mathematical Reasoning from Multi-Mode Supervision

Wei Du, Shubham Toshniwal et al.Dec 17arXiv

Nemotron-Math is a giant math dataset with 7.5 million step-by-step solutions created in three thinking styles and with or without Python help.

#mathematical reasoning#long-context fine-tuning#multi-mode supervision

Not triaged yet