GUI-Libra is a training recipe that helps computer-using AI agents both think carefully and click precisely on screens.
This paper builds a Computer-Using World Model (CUWM) that lets an AI “imagine” what a desktop app (such as Word, Excel, or PowerPoint) will look like after a click or keystroke, before performing the action for real.
This paper builds GUI-Owl-1.5, an AI that can use phones, computers, and web browsers like a careful human helper.
OmegaUse is a new AI that can use phones and computers by looking at screenshots and deciding where to click, type, or scroll—much like a careful human user.
MAI-UI is a family of AI agents that can see, understand, and control phone and computer screens using plain language.