SkillsBench: Benchmarking How Well Agent Skills Work Across Diverse Tasks
Intermediate · Xiangyi Li, Wenbo Chen et al. · Feb 13 · arXiv
SkillsBench is a large benchmark that measures whether giving AI agents step-by-step 'Skills' actually helps them finish real tasks.
#Agent Skills #LLM agents #Benchmarking