Golden Goose turns messy internet text into clean multiple-choice puzzles that AI models can learn from, with rewards that can be checked automatically.
The paper trains AI agents more effectively by grading not just their final answers but also how they reason and use tools along the way.
LightOnOCR-2-1B is a single, compact AI model that reads PDF pages and scans and turns them into clean, well-ordered text without using fragile multi-step OCR pipelines.