MIND-V: Hierarchical Video Generation for Long-Horizon Robotic Manipulation with RL-based Physical Alignment
Intermediate · Ruicheng Zhang, Mingyang Zhang et al. · Dec 7 · arXiv
Robots need lots of realistic, long videos to learn, but collecting them is slow and expensive.
#hierarchical video generation #robotic manipulation #long-horizon planning