How I Study AI - Learn AI Papers & Lectures the Easy Way

Papers (18)

#test-time scaling

Reinforced Attention Learning

Intermediate
Bangzheng Li, Jianmo Ni et al. · Feb 4 · arXiv

This paper teaches AI to pay attention better by training its focus, not just its words.

#Reinforced Attention Learning #attention policy #multimodal LLM

Parallel-Probe: Towards Efficient Parallel Thinking via 2D Probing

Intermediate
Tong Zheng, Chengsong Huang et al. · Feb 3 · arXiv

Parallel-Probe is a simple add-on that lets many AI “thought paths” think at once but stop early when they already agree.

#parallel thinking #2D probing #consensus-based early stopping

SWE-World: Building Software Engineering Agents in Docker-Free Environments

Intermediate
Shuang Sun, Huatong Song et al. · Feb 3 · arXiv

SWE-World lets code-fixing AI agents practice and learn without heavy Docker containers by using smart models that pretend to be the computer and tests.

#SWE-World #software engineering agents #Docker-free training

RE-TRAC: REcursive TRAjectory Compression for Deep Search Agents

Intermediate
Jialiang Zhu, Gongrui Zhang et al. · Feb 2 · arXiv

Re-TRAC is a new way for AI search agents to learn from each try, write a clean summary of what happened, and then use that summary to do better on the next try.

#Re-TRAC #trajectory compression #deep research agents

CoDiQ: Test-Time Scaling for Controllable Difficult Question Generation

Intermediate
Zhongyuan Peng, Caijun Xu et al. · Feb 2 · arXiv

CoDiQ is a recipe for making hard-but-solvable math and coding questions on purpose, and it controls how hard they get while you generate them.

#controllable difficulty #test-time scaling #question generation

FS-Researcher: Test-Time Scaling for Long-Horizon Research Tasks with File-System-Based Agents

Intermediate
Chiwei Zhu, Benfeng Xu et al. · Feb 2 · arXiv

FS-Researcher is a two-agent system that lets AI do very long research by saving everything in a computer folder so it never runs out of memory.

#FS-Researcher #file-system agents #external memory

Thinking in Frames: How Visual Context and Test-Time Scaling Empower Video Reasoning

Intermediate
Chengzu Li, Zanyi Wang et al. · Jan 28 · arXiv

This paper shows that making short videos can help AI plan and reason in pictures better than writing out steps in text.

#video reasoning #visual planning #test-time scaling

Inference-Time Scaling of Verification: Self-Evolving Deep Research Agents via Test-Time Rubric-Guided Verification

Intermediate
Yuxuan Wan, Tianqing Fang et al. · Jan 22 · arXiv

DeepVerifier is a plug-in checker that helps Deep Research Agents catch and fix their own mistakes while they are working, without retraining.

#Deep Research Agents #verification asymmetry #rubrics-based feedback

Inference-time Physics Alignment of Video Generative Models with Latent World Models

Intermediate
Jianhao Yuan, Xiaofeng Zhang et al. · Jan 15 · arXiv

This paper teaches video-making AIs to follow real-world physics better without retraining them.

#video generation #physics plausibility #latent world model

Chaining the Evidence: Robust Reinforcement Learning for Deep Search Agents with Citation-Aware Rubric Rewards

Intermediate
Jiajie Zhang, Xin Lv et al. · Jan 9 · arXiv

The paper fixes a big problem in training web-searching AI: rewarding only the final answer makes agents cut corners and sometimes hallucinate.

#deep search agents #reinforcement learning #rubric rewards

DiffProxy: Multi-View Human Mesh Recovery via Diffusion-Generated Dense Proxies

Intermediate
Renke Wang, Zhenyu Zhang et al. · Jan 5 · arXiv

DiffProxy turns tricky multi-camera photos of a person into a clean 3D body and hands by first painting a precise 'map' on each pixel and then fitting a standard body model to that map.

#human mesh recovery #SMPL-X #dense correspondence

SWE-RM: Execution-free Feedback For Software Engineering Agents

Intermediate
KaShun Shum, Binyuan Hui et al. · Dec 26 · arXiv

Coding agents used to fix software rely on feedback; unit tests give only pass/fail signals that are often noisy or missing.

#execution-free feedback #reward model #software engineering agents