How I Study AI - Learn AI Papers & Lectures the Easy Way

Teaching Models to Teach Themselves: Reasoning at the Edge of Learnability

Intermediate

Shobhita Sundaram, John Quan et al.Jan 26arXiv

This paper teaches a model to be its own teacher so it can climb out of a learning plateau on very hard math problems.

#meta-reinforcement learning#teacher-student self-play#grounded rewards

Not triaged yet

Reasoning Palette: Modulating Reasoning via Latent Contextualization for Controllable Exploration for (V)LMs

Intermediate

Rujiao Long, Yang Li et al.Dec 19arXiv

Reasoning Palette gives a language or vision-language model a tiny hidden “mood” (a latent code) before it starts answering, so it chooses a smarter plan rather than just rolling dice on each next word.

#Reasoning Palette#latent contextualization#VAE

Not triaged yet

From Imitation to Discrimination: Toward A Generalized Curriculum Advantage Mechanism Enhancing Cross-Domain Reasoning Tasks

Intermediate

Changpeng Yang, Jinyang Wu et al.Dec 2arXiv

This paper teaches AI models to reason better by first copying only good examples and later learning from mistakes too.

#Curriculum Advantage Policy Optimization#advantage-based RL#imitation learning

Not triaged yet

Papers3

Teaching Models to Teach Themselves: Reasoning at the Edge of Learnability

Reasoning Palette: Modulating Reasoning via Latent Contextualization for Controllable Exploration for (V)LMs

From Imitation to Discrimination: Toward A Generalized Curriculum Advantage Mechanism Enhancing Cross-Domain Reasoning Tasks