This paper teaches a computer to find buttons, text, and icons on a screen so that it can click and type in the right places. This skill is called GUI grounding.
FOCUSUI makes computer-using AI faster while keeping it accurate by having it look only at the important parts of a screen.