How I Study AI - Learn AI Papers & Lectures the Easy Way

WorldCompass: Reinforcement Learning for Long-Horizon World Models

Zehan Wang, Tengfei Wang et al. · Feb 9 · arXiv

WorldCompass uses reinforcement learning after pretraining to teach video world models to follow actions more accurately while keeping the pictures pretty.

#world models #reinforcement learning #clip-level rollout

WideSeek-R1: Exploring Width Scaling for Broad Information Seeking via Multi-Agent Reinforcement Learning

Intermediate

Zelai Xu, Zhexuan Xu et al. · Feb 4 · arXiv

WideSeek-R1 teaches a small 4B-parameter language model to act like a well-run team: one leader plans, many helpers work in parallel, and everyone learns together with reinforcement learning.

#width scaling #multi-agent reinforcement learning #orchestration

Taming Hallucinations: Boosting MLLMs' Video Understanding via Counterfactual Video Generation

Intermediate

Zhe Huang, Hao Wen et al. · Dec 30 · arXiv

Multimodal Large Language Models (MLLMs) often hallucinate on videos by trusting words and common sense more than what the frames really show.

#multimodal large language model #video understanding #visual hallucination
