๐ŸŽ“How I Study AIHISA
๐Ÿ“–Read
๐Ÿ“„Papers๐Ÿ“ฐBlogs๐ŸŽฌCourses
๐Ÿ’กLearn
๐Ÿ›ค๏ธPaths๐Ÿ“šTopics๐Ÿ’กConcepts๐ŸŽดShorts
๐ŸŽฏPractice
๐Ÿ“Daily Log๐ŸŽฏPrompts๐Ÿง Review
SearchSettings
How I Study AI - Learn AI Papers & Lectures the Easy Way

Papers3

AllBeginnerIntermediateAdvanced
All SourcesarXiv
#exploration

$ฯ€$-StepNFT: Wider Space Needs Finer Steps in Online RL for Flow-based VLAs

Intermediate
Siting Wang, Xiaofeng Wang et al.Mar 2arXiv

Robots that read images and instructions (VLAs) get stuck following a narrow, fragile path after normal training.

#vision-language-action#flow matching#stochastic differential equations

Not triaged yet

Exploratory Memory-Augmented LLM Agent via Hybrid On- and Off-Policy Optimization

Intermediate
Zeyuan Liu, Jeonghye Kim et al.Feb 26arXiv

This paper teaches a language-model agent to explore smarter by combining two ways of learning (on-policy and off-policy) with a simple, self-written memory.

#EMPO#memory-augmented agents#on-policy learning

Not triaged yet

Unveiling Implicit Advantage Symmetry: Why GRPO Struggles with Exploration and Difficulty Adaptation

Intermediate
Zhiqi Yu, Zhangquan Chen et al.Feb 5arXiv

The paper finds a hidden symmetry inside GRPOโ€™s advantage calculation that accidentally stops models from exploring new good answers and from paying the right attention to easy versus hard problems at the right times.

#GRPO#GRAE#A-GRAE

Not triaged yet