SWE-World lets code-fixing AI agents practice and learn without heavy Docker containers by using smart models that pretend to be the computer and its tests.
SWE-Universe is a factory-like system that turns real GitHub pull requests into safe, repeatable coding practice worlds with automatic checkers.
The paper introduces RPG-Encoder, a way to turn a whole code repository into one clear map that mixes meaning (semantics) with structure (dependencies).
This paper builds a smart team of AI helpers, called MEnvAgent, that automatically sets up the right computer environments for code projects in many languages.
This paper teaches code AIs to work more like real software engineers by adding a training stage partway through their learning that uses real development workflows.
Terminal-Bench 2.0 is a tough test that checks how well AI agents can solve real, professional tasks by typing commands in a computer terminal.
MegaFlow is a new system that helps thousands of AI agents practice and test big, messy tasks (like fixing real software bugs) all at once without crashing or wasting money.
Coding agents that fix software rely on feedback, but unit tests give only pass/fail signals that are often noisy or missing.