Ground Slow, Move Fast: A Dual-System Foundation Model for Generalizable Vision-and-Language Navigation
Intermediate · Meng Wei, Chenyang Wan et al. · Dec 9 · arXiv
Robots that follow spoken instructions used to be slow and jerky because one big model tried to think and move at the same time.
#vision-and-language navigation #VLM planner #dual-system architecture