Papers8

All Beginner Intermediate Advanced

All Sources arXiv

#LLM judge

Learning When to Act or Refuse: Guarding Agentic Reasoning Models for Safe Multi-Step Tool Use

Intermediate

Aradhye Agarwal, Gurdit Siyan et al.Mar 3arXiv

Agentic AIs don’t just chat; they plan, use tools, and take many steps, so one wrong click can cause real harm.

#MOSAIC#agentic safety#plan-check-act

Not triaged yet

MUSE: A Run-Centric Platform for Multimodal Unified Safety Evaluation of Large Language Models

Beginner

Zhongxi Wang, Yueqian Lin et al.Mar 3arXiv

MUSE is a new open-source platform that tests how safely AI models behave when you talk to them with text, sound, pictures, and video, not just text.

#MUSE#multimodal safety evaluation#red-teaming

Not triaged yet

Synthetic Visual Genome 2: Extracting Large-scale Spatio-Temporal Scene Graphs from Videos

Beginner

Ziqi Gao, Jieyu Zhang et al.Feb 26arXiv

This paper builds a giant, automatically made video library called SVG2 that tells who is in a video, what they look like, and how they interact over time.

#video scene graph#spatio-temporal reasoning#panoptic segmentation

Not triaged yet

Steering LLMs via Scalable Interactive Oversight

Intermediate

Enyu Zhou, Zhiheng Xi et al.Feb 4arXiv

The paper tackles a common problem: people can ask AI to do big, complex tasks, but they can’t always explain exactly what they want or check the results well.

#scalable oversight#interactive alignment#requirement elicitation

Not triaged yet

WildGraphBench: Benchmarking GraphRAG with Wild-Source Corpora

Beginner

Pengyu Wang, Benfeng Xu et al.Feb 2arXiv

WildGraphBench is a new test that checks how well GraphRAG systems find and combine facts from messy, real-world web pages.

#GraphRAG#Retrieval-Augmented Generation#Wikipedia references

Not triaged yet

Rewarding the Rare: Uniqueness-Aware RL for Creative Problem Solving in LLMs

Intermediate

Zhiyuan Hu, Yucheng Wang et al.Jan 13arXiv

The paper fixes a common problem in training AI reasoners: models get stuck using the same favorite solution style and stop exploring new ways to solve problems.

#Uniqueness-Aware Reinforcement Learning#LLM reasoning#strategy clustering

Not triaged yet

Chaining the Evidence: Robust Reinforcement Learning for Deep Search Agents with Citation-Aware Rubric Rewards

Intermediate

Jiajie Zhang, Xin Lv et al.Jan 9arXiv

The paper fixes a big problem in training web-searching AI: rewarding only the final answer makes agents cut corners and sometimes hallucinate.

#deep search agents#reinforcement learning#rubric rewards

Not triaged yet

COMPASS: A Framework for Evaluating Organization-Specific Policy Alignment in LLMs

Intermediate

Dasol Choi, DongGeon Lee et al.Jan 5arXiv

COMPASS is a new framework that turns a company’s rules into thousands of smart test questions to check if chatbots follow those rules.

#policy alignment#allowlist denylist#enterprise AI safety

Not triaged yet