This paper teaches language models not only to get the final answer right but also to reason in a way that others can reliably follow.
This paper teaches AI agents to learn new, reusable skills and improve over time through reinforcement learning rather than prompting alone.