This paper fixes a common problem in reasoning AIs called Lazy Reasoning, where the model rambles instead of making a good plan.
The paper shows a new way to teach AI assistants how to use tools in many-step conversations by mining ordinary text on the internet for step-by-step βhow-toβ knowledge.