HiF-VLA: Hindsight, Insight and Foresight through Motion Representation for Vision-Language-Action Models
Intermediate | Minghui Lin, Pengxiang Ding et al. | Dec 10 | arXiv
Robots often act like goldfish with short memories; HiF-VLA fixes this by letting them use motion to remember the past and predict the future.
#Vision-Language-Action #motion vectors #temporal reasoning