πŸŽ“How I Study AIHISA
πŸ“–Read
πŸ“„PapersπŸ“°Blogs🎬Courses
πŸ’‘Learn
πŸ›€οΈPathsπŸ“šTopicsπŸ’‘Concepts🎴Shorts
🎯Practice
🧩Problems🎯Prompts🧠Review
Search
How I Study AI - Learn AI Papers & Lectures the Easy Way

Papers2

AllBeginnerIntermediateAdvanced
All SourcesarXiv
#uncertainty estimation

No Global Plan in Chain-of-Thought: Uncover the Latent Planning Horizon of LLMs

Intermediate
Liyan Xu, Mo Yu et al.Feb 2arXiv

Large language models don’t map out a full step-by-step plan before they start thinking; they mostly plan just a little bit ahead.

#chain-of-thought#latent planning horizon#Tele-Lens

GARDO: Reinforcing Diffusion Models without Reward Hacking

Intermediate
Haoran He, Yuxiao Ye et al.Dec 30arXiv

GARDO is a new way to fine-tune text-to-image diffusion models with reinforcement learning without getting tricked by bad reward signals.

#GARDO#reward hacking#gated KL regularization