LOCA-bench: Benchmarking Language Agents Under Controllable and Extreme Context Growth
Intermediate · Weihao Zeng, Yuzhen Huang et al. · Feb 8 · arXiv
LOCA-bench is a benchmark that tests whether AI agents keep working correctly as their task list and background context grow extremely long.
#LOCA-bench #long-context agents #context rot