WoG (World Guidance) teaches a robot to imagine just the right bits of the near future and then use them to pick better actions.
Vision-Language-Action (VLA) models act as a robot’s “see–think–do” brain: they connect cameras (vision), words (language), and motors (action).
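To make the "imagine a little, then act" idea concrete, here is a toy Python sketch. It is not the WoG architecture: the 2-D gripper world, the `Observation` type, and the functions `imagine_guidance`, `pick_action`, and `vla_step` are all hypothetical names invented for illustration. The only point it shows is the two-stage flow, where a small prediction about the near future is made first and the action is then chosen from it.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Observation:
    """Toy stand-in for what a VLA model receives (all fields are assumptions)."""
    gripper_xy: Tuple[float, float]  # "vision": where the camera sees the gripper
    target_xy: Tuple[float, float]   # "vision": where the camera sees the object
    instruction: str                 # "language": the task in words


def imagine_guidance(obs: Observation) -> List[float]:
    """Imagine only a small piece of the near future: the next waypoint.

    A full future image is never produced; the "imagination" is just two numbers.
    """
    if "reach" not in obs.instruction.lower():
        # Instruction not understood by this toy: imagine staying put.
        return list(obs.gripper_xy)
    # Waypoint halfway between the gripper and the target.
    return [g + 0.5 * (t - g) for g, t in zip(obs.gripper_xy, obs.target_xy)]


def pick_action(obs: Observation, guidance: List[float],
                max_step: float = 0.1) -> List[float]:
    """Choose a motor command that moves toward the imagined waypoint."""
    # Clip each axis so that a single step stays small.
    return [max(-max_step, min(max_step, w - g))
            for g, w in zip(obs.gripper_xy, guidance)]


def vla_step(obs: Observation) -> List[float]:
    """See (observation) -> think (guidance) -> do (action)."""
    guidance = imagine_guidance(obs)
    return pick_action(obs, guidance)


if __name__ == "__main__":
    obs = Observation(gripper_xy=(0.0, 0.0), target_xy=(1.0, 0.1),
                      instruction="Reach the block")
    print(vla_step(obs))
```

In a real system both stages would be learned networks operating on images and text; the hand-written rules here only stand in for them.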