Robots used to explore by following simple rules or chasing short-term rewards, which often left them wasting time and backtracking.
KAGE-Bench is a fast, carefully controlled benchmark that tests how well reinforcement learning (RL) agents trained on pixels handle specific visual changes, like new backgrounds or lighting, without changing the actual game rules.
The paper introduces Intervention Training (InT), a simple way for a language model to find and fix the first wrong step in its own reasoning using a short, targeted correction.
This survey explains how to make AI agents not just smart, but also efficient with their time, memory, and tool use.
The paper tackles a big problem: when you merge several RL-trained models by simple weight averaging, their specialized skills get watered down.
Think3D lets AI models stop guessing from flat pictures and start exploring real 3D space, like walking around a room in a video game.
This paper teaches video-making AIs to follow real-world physics, so rolling balls roll right and collisions look believable.
RL-trained search agents often sound confident even when they don’t know, which can mislead people.
This paper is the first big map of how AI can fix real software problems, not just write short code snippets.
Cities are full of places defined by people, like schools and parks, which are hard to see clearly from space without extra clues.
STEP3-VL-10B is a small (10-billion-parameter) open multimodal model that sees images and reads text, yet matches the scores of much larger models.
SkinFlow is a 7B-parameter vision–language model that diagnoses skin conditions by sending the most useful visual information to the language brain, instead of just getting bigger.