How I Study AI - Learn AI Papers & Lectures the Easy Way

TOPReward: Token Probabilities as Hidden Zero-Shot Rewards for Robotics

Intermediate

Shirui Chen, Cole Harrison et al. · Feb 22 · arXiv

Robots learn better when they get small hints at every step instead of only a final thumbs-up or thumbs-down.

#TOPReward #token probabilities #logits


Scaling Behavior Cloning Improves Causal Reasoning: An Open Model for Real-Time Video Game Playing

Intermediate

Yuguang Yue, Irakli Salia et al. · Jan 8 · arXiv

The paper trains a game-playing AI to copy good human players (behavior cloning) and shows that simply scaling up the model and the data makes the AI reason more causally: it pays attention to what truly causes outcomes on screen.

#behavior cloning #causal reasoning #causal confusion


NitroGen: An Open Foundation Model for Generalist Gaming Agents

Intermediate

Loïc Magne, Anas Awadalla et al. · Jan 4 · arXiv

NitroGen is a vision-to-action AI that learns to play many video games by watching 40,000 hours of gameplay videos from over 1,000 titles with on-screen controller overlays.

#NitroGen #generalist gaming agent #behavior cloning

