VLingNav is a robot navigation system that sees, reads instructions, and acts, while deciding when to think hard and when to just move.
Computers can click like a woodpecker, one peck at a time, but they struggle to drag smoothly like a human hand; this paper aims to fix that.