How I Study AI - Learn AI Papers & Lectures the Easy Way

Papers (7)

#LLM judge

Learning When to Act or Refuse: Guarding Agentic Reasoning Models for Safe Multi-Step Tool Use

Intermediate
Aradhye Agarwal, Gurdit Siyan et al. | Mar 3 | arXiv

Agentic AIs don’t just chat; they plan, use tools, and take many steps, so one wrong click can cause real harm.

#MOSAIC #agentic safety #plan-check-act

Synthetic Visual Genome 2: Extracting Large-scale Spatio-Temporal Scene Graphs from Videos

Beginner
Ziqi Gao, Jieyu Zhang et al. | Feb 26 | arXiv

This paper builds SVG2, a large, automatically generated video dataset that records who appears in each video, what they look like, and how they interact over time.

#video scene graph #spatio-temporal reasoning #panoptic segmentation

Steering LLMs via Scalable Interactive Oversight

Intermediate
Enyu Zhou, Zhiheng Xi et al. | Feb 4 | arXiv

The paper tackles a common problem: people can ask AI to do big, complex tasks, but they can’t always explain exactly what they want or check the results well.

#scalable oversight #interactive alignment #requirement elicitation

WildGraphBench: Benchmarking GraphRAG with Wild-Source Corpora

Beginner
Pengyu Wang, Benfeng Xu et al. | Feb 2 | arXiv

WildGraphBench is a new benchmark that checks how well GraphRAG systems find and combine facts from messy, real-world web pages.

#GraphRAG #Retrieval-Augmented Generation #Wikipedia references

Rewarding the Rare: Uniqueness-Aware RL for Creative Problem Solving in LLMs

Intermediate
Zhiyuan Hu, Yucheng Wang et al. | Jan 13 | arXiv

The paper fixes a common problem in training AI reasoners: models get stuck using the same favorite solution style and stop exploring new ways to solve problems.

#Uniqueness-Aware Reinforcement Learning #LLM reasoning #strategy clustering

Chaining the Evidence: Robust Reinforcement Learning for Deep Search Agents with Citation-Aware Rubric Rewards

Intermediate
Jiajie Zhang, Xin Lv et al. | Jan 9 | arXiv

The paper fixes a big problem in training web-searching AI: rewarding only the final answer makes agents cut corners and sometimes hallucinate.

#deep search agents #reinforcement learning #rubric rewards

COMPASS: A Framework for Evaluating Organization-Specific Policy Alignment in LLMs

Intermediate
Dasol Choi, DongGeon Lee et al. | Jan 5 | arXiv

COMPASS is a new framework that turns a company’s rules into thousands of targeted test questions to check whether chatbots follow those rules.

#policy alignment #allowlist denylist #enterprise AI safety