Papers (7)

#imitation learning

RoboMME: Benchmarking and Understanding Memory for Robotic Generalist Policies

RoboMME is a large new benchmark that checks whether robot policies can remember important things over time, not just react to what they see right now.

#robot memory #long-horizon manipulation #vision-language-action (VLA)

VLANeXt: Recipes for Building Strong VLA Models

Intermediate

Xiao-Ming Wu, Bin Fan et al. · Feb 20 · arXiv

This paper studies Vision–Language–Action (VLA) models under one fair setup to find which design choices truly matter.

#Vision-Language-Action #robot manipulation #flow matching

InterPrior: Scaling Generative Control for Physics-Based Human-Object Interactions

Intermediate

Sirui Xu, Samuel Schulter et al. · Feb 5 · arXiv

InterPrior is a new generative controller for simulated humans and humanoid robots that can move, balance, and use objects by following simple goals instead of step-by-step instructions.

#human-object interaction #physics-based control #goal-conditioned policy

Cosmos Policy: Fine-Tuning Video Models for Visuomotor Control and Planning

Intermediate

Moo Jin Kim, Yihuai Gao et al. · Jan 22 · arXiv

Cosmos Policy teaches robots to act by fine-tuning a powerful video model in just one training stage, without changing the model’s architecture.

#video diffusion #robot policy learning #visuomotor control

Fast-ThinkAct: Efficient Vision-Language-Action Reasoning via Verbalizable Latent Planning

Intermediate

Chi-Pin Huang, Yunze Man et al. · Jan 14 · arXiv

Fast-ThinkAct teaches a robot to plan with a few compact latent "thought tokens" instead of long paragraphs of text, making it much faster while keeping its reasoning ability.

#Vision-Language-Action #latent reasoning #verbalizable planning

MIND-V: Hierarchical Video Generation for Long-Horizon Robotic Manipulation with RL-based Physical Alignment

Intermediate

Ruicheng Zhang, Mingyang Zhang et al. · Dec 7 · arXiv

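Robots need lots of realistic, long videos to learn, but collecting them is slow and expensive. As its title indicates, MIND-V instead generates long-horizon manipulation videos hierarchically and uses RL-based alignment to keep them physically plausible.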

#hierarchical video generation #robotic manipulation #long-horizon planning

From Imitation to Discrimination: Toward A Generalized Curriculum Advantage Mechanism Enhancing Cross-Domain Reasoning Tasks

Intermediate

Changpeng Yang, Jinyang Wu et al. · Dec 2 · arXiv

This paper teaches AI models to reason better by first copying only good examples and later learning from mistakes too.

#Curriculum Advantage Policy Optimization #advantage-based RL #imitation learning
