Youtu-LLM is a small (1.96B-parameter) language model that was trained from scratch to think, plan, and act like an agent instead of just copying bigger models.
GenEnv is a training system where a student AI and a teacher simulator grow together by exchanging tasks and feedback.
This paper teaches vision-language models to reason about pictures using puzzles instead of expensive human labels.
This paper builds InternGeometry, a large language model agent that solves Olympiad-level geometry by talking to a math engine, remembering what worked, and trying smart new ideas.
This paper teaches AI models to reason better by first copying only good examples and later learning from mistakes too.