This paper introduces MATTRL, a way for multiple AI agents to learn from their own conversations at test time using short, reusable text notes instead of retraining their weights.
This paper trains AI models to reason better in two stages: first by imitating only good examples, and later by learning from mistakes as well.