How I Study AI - Learn AI Papers & Lectures the Easy Way

Papers (6)

#multi-hop QA

Search-R2: Enhancing Search-Integrated Reasoning via Actor-Refiner Collaboration

Intermediate
Bowei He, Minda Hu et al. | Feb 3 | arXiv

This paper teaches AI to look things up on the web and fix its own mistakes mid-thought instead of starting over from scratch.

#search-integrated reasoning #reinforcement learning #credit assignment

MemOCR: Layout-Aware Visual Memory for Efficient Long-Horizon Reasoning

Intermediate
Yaorui Shi, Shugui Liu et al. | Jan 29 | arXiv

MemOCR is a new way for AI to remember long histories by turning important notes into a picture with big, bold parts for key facts and tiny parts for details.

#MemOCR #visual memory #adaptive information density

BAPO: Boundary-Aware Policy Optimization for Reliable Agentic Search

Intermediate
Shiyu Liu, Yongjing Yin et al. | Jan 16 | arXiv

RL-trained search agents often sound confident even when they don't know, which can mislead people.

#agentic search #reinforcement learning #boundary awareness

Parallel Context-of-Experts Decoding for Retrieval Augmented Generation

Intermediate
Giulio Corallo, Paolo Papotti | Jan 13 | arXiv

This paper introduces PCED, a way to use many documents as separate 'experts' in parallel so an AI can stitch answers together without stuffing everything into one giant prompt.

#Retrieval-Augmented Generation #PCED #contrastive decoding

Chaining the Evidence: Robust Reinforcement Learning for Deep Search Agents with Citation-Aware Rubric Rewards

Intermediate
Jiajie Zhang, Xin Lv et al. | Jan 9 | arXiv

The paper fixes a big problem in training web-searching AI: rewarding only the final answer makes agents cut corners and sometimes hallucinate.

#deep search agents #reinforcement learning #rubric rewards

QwenLong-L1.5: Post-Training Recipe for Long-Context Reasoning and Memory Management

Intermediate
Weizhou Shen, Ziyi Yang et al. | Dec 15 | arXiv

QwenLong-L1.5 is a training recipe that helps AI read and reason over very long documents by improving the data it learns from, the way it is trained, and how it remembers important stuff.

#long-context reasoning #reinforcement learning #GRPO