This paper builds a big, reusable library of computer skills so an AI can use Windows apps more like a careful human, not a clumsy robot.
Computer-using agents kept forgetting important visual details over long tasks and could not reliably find up-to-date, step-by-step help for unfamiliar apps.
MAI-UI is a family of AI agents that can see, understand, and control phone and computer screens using plain language.