Qwen3-Coder-Next is an open-weight coding model that uses only 3B of its 80B total parameters at a time, so it runs fast while still being smart.
CLI-Gym is a new way to create lots of realistic computer-fixing tasks for AI by safely breaking and then repairing software environments inside containers.
Terminal-Bench 2.0 is a tough test that checks how well AI agents can solve real, professional tasks by typing commands in a computer terminal.