FeatureBench is a new benchmark that tests AI coding agents on building real software features, not just fixing small bugs.
The paper introduces RPG-Encoder, a method for encoding an entire code repository into a single representation that combines semantics (what the code means) with structure (how its parts depend on one another).
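To make the idea of a combined semantic and structural map concrete, here is a minimal Python sketch. It is not RPG-Encoder's actual format or algorithm, which the summary above does not describe. The names `RepoNode`, `RepoMap`, `search`, and `dependents_of`, the example file paths, and the choice of files as nodes are all hypothetical and used only for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class RepoNode:
    """One unit of the repository (assumed here to be a file)."""
    path: str
    summary: str  # hypothetical free-text description of what the unit does


@dataclass
class RepoMap:
    """Toy repository map: semantic nodes plus directed dependency edges."""
    nodes: dict[str, RepoNode] = field(default_factory=dict)
    # Maps each path to the set of paths it depends on.
    deps: dict[str, set[str]] = field(default_factory=dict)

    def add_node(self, path: str, summary: str) -> None:
        self.nodes[path] = RepoNode(path, summary)
        self.deps.setdefault(path, set())

    def add_dependency(self, src: str, dst: str) -> None:
        """Record that `src` depends on `dst`; both must already be nodes."""
        if src not in self.nodes or dst not in self.nodes:
            raise KeyError("both endpoints must be added as nodes first")
        self.deps[src].add(dst)

    def search(self, keyword: str) -> list[str]:
        """Semantic side: paths whose summary mentions the keyword."""
        kw = keyword.lower()
        return sorted(p for p, n in self.nodes.items() if kw in n.summary.lower())

    def dependents_of(self, path: str) -> list[str]:
        """Structural side: paths that directly depend on `path`."""
        return sorted(s for s, targets in self.deps.items() if path in targets)


# Hypothetical three-file repository, for illustration only.
repo = RepoMap()
repo.add_node("auth/login.py", "Handles user login and session creation")
repo.add_node("auth/tokens.py", "Issues and validates session tokens")
repo.add_node("api/routes.py", "Maps HTTP routes to handlers")
repo.add_dependency("auth/login.py", "auth/tokens.py")
repo.add_dependency("api/routes.py", "auth/login.py")
```

In this sketch a single object answers both kinds of question: `repo.search("session")` finds files by what they do, and `repo.dependents_of("auth/tokens.py")` finds files by how they are connected. A real encoder would presumably derive summaries and edges from the code itself rather than take them by hand.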