This paper shows that code-writing AI agents can take an existing math problem and automatically turn it into a new, harder one while keeping it solvable.
Large language models sometimes reach the right answer for the wrong reasons, which makes a correct-looking answer hard to trust and its flawed reasoning easy to miss.
VERGE is a system in which an AI writer (an LLM) works together with a strict logic checker (an SMT solver), so that answers are not just well written but also logically sound.
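
The writer-plus-checker idea can be illustrated with a toy propose-and-check loop. This is only a minimal sketch under stated assumptions, not VERGE's actual implementation: `hypothetical_writer` is a hard-coded stand-in for the LLM, and `find_counterexample` is a brute-force search over a small range of integers standing in for the SMT solver. All names here are hypothetical. A real SMT solver would reason about all integers rather than a bounded sample.

```python
from typing import Callable, Iterable, Iterator, Optional

# A claim is a predicate over one integer; it should hold for every input.
Claim = Callable[[int], bool]


def find_counterexample(claim: Claim, domain: Iterable[int] = range(-5, 6)) -> Optional[int]:
    """Toy stand-in for the SMT solver (assumption): scan a small bounded
    domain and return the first input that falsifies the claim, if any."""
    for x in domain:
        if not claim(x):
            return x
    return None


def hypothetical_writer() -> Iterator[tuple[str, Claim]]:
    """Toy stand-in for the LLM (assumption): proposes a flawed claim first,
    then a corrected one."""
    yield "x*x > x", lambda x: x * x > x
    yield "x*x >= x", lambda x: x * x >= x


def refine(candidates: Iterable[tuple[str, Claim]]):
    """Accept the first claim the checker cannot falsify; record each
    rejected claim together with its counterexample."""
    rejected = []
    for text, claim in candidates:
        counterexample = find_counterexample(claim)
        if counterexample is None:
            return text, rejected
        rejected.append((text, counterexample))
    return None, rejected


accepted, rejected = refine(hypothetical_writer())
print("accepted:", accepted)
print("rejected:", rejected)
```

In this sketch the checker rejects the first claim because it fails at x = 0, and the loop moves on to the corrected claim. In a system of the kind described above, the counterexample would presumably be fed back to the writer rather than simply logged, and passing a bounded search would not count as a proof.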