CFE-BENCH is a new, teacher-verified "Classroom Final Exam" for AI that uses real college STEM problems to test deep, step-by-step reasoning.
TSRBench is a large benchmark that checks whether AI models can understand and reason about time series, meaning data that changes over time, like heartbeats, stock prices, and weather.