This paper teaches AI to look things up on the web and fix its own mistakes mid-thought instead of starting over from scratch.
Long tasks trip up most AIs because these systems lose track of their goals and make small mistakes that snowball over many steps.
MAI-UI is a family of AI agents that can see, understand, and control phone and computer screens using plain language.
BEAVER is a new way to check, with formal guarantees, how likely a language model is to give answers that obey important rules.