Sparse Video Generation Propels Real-World Beyond-the-View Vision-Language Navigation
Intermediate · Hai Zhang, Siqi Liang et al. · Feb 5 · arXiv
Robots usually need very detailed, step-by-step directions, but real life often gives only short, simple goals like “find the red bench.”
#Beyond-the-View Navigation #Sparse Video Generation #Vision-Language Navigation