This paper introduces TAM-Eval, a new way to test how well AI models can create, fix, and update unit tests for real software projects.
This paper introduces DeepResearchEval, a fully automated way to build realistic deep research tasks and to grade long research reports from AI systems.
SAM Audio is a new AI model that can pull out exactly the sound you want from a noisy mix using text, clicks on a video, and time ranges, together or separately.