This paper builds a smart team of AI helpers, called MEnvAgent, that automatically sets up the right computer environments for code projects in many languages.
The paper tackles the problem of understanding very long, first‑person videos (spanning days to a week) by giving an AI a smarter memory and better tools.
NL2Repo-Bench is a new benchmark that tests whether coding agents can build a whole Python library from just one long natural-language document and an empty folder.