Reasoning Core is a tool that automatically generates a wide variety of logic and math puzzles, checks every answer with dedicated solvers, and lets you smoothly dial the difficulty up or down.
Most reinforcement learning agents get only a simple pass/fail reward, which hides how close each attempt actually came to being right.
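
To make the generate, verify, and dial-difficulty loop concrete, here is a minimal toy sketch in Python. It is not Reasoning Core's actual API: `generate_puzzle`, `solve`, and `binary_reward` are hypothetical names, the puzzles are plain arithmetic expressions, and a small AST evaluator stands in for a real solver. The `difficulty` argument simply sets how many operators the expression contains.

```python
import ast
import operator
import random

# Hypothetical toy example; none of these names come from Reasoning Core.
OPS = {ast.Add: operator.add, ast.Sub: operator.sub, ast.Mult: operator.mul}


def generate_puzzle(difficulty: int, rng: random.Random) -> str:
    """Build a nested arithmetic expression with `difficulty` operators."""
    expr = str(rng.randint(1, 9))
    for _ in range(difficulty):
        op = rng.choice(["+", "-", "*"])
        expr = f"({expr} {op} {rng.randint(1, 9)})"
    return expr


def solve(expr: str) -> int:
    """Compute the reference answer by walking the parsed expression tree.

    This stands in for an external solver: the answer is derived
    independently of whatever the agent proposes.
    """
    def walk(node):
        if isinstance(node, ast.Expression):
            return walk(node.body)
        if isinstance(node, ast.Constant):
            return node.value
        if isinstance(node, ast.BinOp):
            return OPS[type(node.op)](walk(node.left), walk(node.right))
        raise ValueError("unsupported expression")

    return walk(ast.parse(expr, mode="eval"))


def binary_reward(expr: str, answer: int) -> float:
    """Pass/fail reward: 1.0 only when the answer matches the solver exactly."""
    return 1.0 if answer == solve(expr) else 0.0


if __name__ == "__main__":
    rng = random.Random(0)
    for difficulty in (1, 3, 6):
        puzzle = generate_puzzle(difficulty, rng)
        print(difficulty, puzzle, "=", solve(puzzle))
```

Under this pass/fail scheme, an answer that is off by one and an answer that is wildly wrong both receive 0.0, which is the loss of information the second sentence describes.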