HERBench is a new test that checks whether video AI models can combine several clues spread across time, rather than just guessing from a single frame or from language priors.
This paper argues that the fastest and safest path to super-smart AI is for humans and AIs to improve together, not for AI to improve alone.