How I Study AI - Learn AI Papers & Lectures the Easy Way

Papers (138)

#GRPO

SPARK: Stepwise Process-Aware Rewards for Reference-Free Reinforcement Learning

Intermediate
Salman Rahman, Sruthi Gorantla et al. | Dec 2 | arXiv

SPARK teaches AI to grade its own steps without needing the right answers written down anywhere.

#SPARK, #Process Reward Model, #PRM-CoT

ReVSeg: Incentivizing the Reasoning Chain for Video Segmentation with Reinforcement Learning

Intermediate
Yifan Li, Yingda Yin et al. | Dec 2 | arXiv

ReVSeg teaches an AI to segment objects in videos by thinking step-by-step instead of guessing everything at once.

#Reasoning Video Object Segmentation, #Vision-Language Models, #Temporal Grounding

PaCo-RL: Advancing Reinforcement Learning for Consistent Image Generation with Pairwise Reward Modeling

Intermediate
Bowen Ping, Chengyou Jia et al. | Dec 2 | arXiv

This paper teaches image models to keep things consistent across multiple pictures—like the same character, art style, and story logic—using reinforcement learning (RL).

#consistent image generation, #pairwise reward modeling, #reinforcement learning

From Imitation to Discrimination: Toward A Generalized Curriculum Advantage Mechanism Enhancing Cross-Domain Reasoning Tasks

Intermediate
Changpeng Yang, Jinyang Wu et al. | Dec 2 | arXiv

This paper teaches AI models to reason better by first copying only good examples and later learning from mistakes too.

#Curriculum Advantage Policy Optimization, #advantage-based RL, #imitation learning

RealGen: Photorealistic Text-to-Image Generation via Detector-Guided Rewards

Intermediate
Junyan Ye, Leiqi Zhu et al. | Nov 29 | arXiv

RealGen is a new way to make AI-generated pictures look so real that they can fool expert detectors and even careful judges.

#photorealistic text-to-image, #detector-guided rewards, #reinforcement learning

ThreadWeaver: Adaptive Threading for Efficient Parallel Reasoning in Language Models

Intermediate
Long Lian, Sida Wang et al. | Nov 24 | arXiv

ThreadWeaver teaches a language model to split big problems into smaller parts it can solve at the same time, like teammates working in parallel.

#adaptive parallel reasoning, #fork–join, #threaded inference