FeatureBench: Benchmarking Agentic Coding for Complex Feature Development
Intermediate · Qixing Zhou, Jiacheng Zhang et al. · Feb 11 · arXiv
FeatureBench is a new benchmark that tests AI coding agents on building real software features, not just fixing small bugs.
#FeatureBench #agentic coding #execution-based evaluation