SWE-CI: Evaluating Agent Capabilities in Maintaining Codebases via Continuous Integration
Intermediate · Jialong Chen, Xander Xu et al. · Mar 4 · arXiv
SWE-CI is a new benchmark that tests how well AI coding agents can keep a codebase healthy over many changes, not just fix one bug.
#SWE-CI #continuous integration #code maintainability