GUI-Libra is a training recipe that helps computer-using AI agents both think carefully and click precisely on screens.
Real-life directions are often vague, so the paper creates a task where a robot can ask questions while it searches for a very specific object in a big house.