SWE-Universe is a factory-like system that turns real GitHub pull requests into safe, repeatable coding practice worlds with automatic checkers.
SWE-EVO is a new test (benchmark) that checks if AI coding agents can upgrade real software projects over many steps, not just fix one small bug.