SLATE is a new way to teach AI to think step by step while using a search engine, giving feedback at each step instead of only at the end.
This paper shows that when AI models grade university-level math proofs, they often disagree with human experts in systematic ways.
Deep search agents can plan and browse the web in many steps, but they often fail because they donβt notice when their own thinking drifts off-track.