TriPlay-RL is a three-role self-play training loop (attacker, defender, evaluator) that teaches AI models to be safer with almost no manual labels.
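As a rough illustration of what one round of such a three-role loop could look like, here is a minimal toy sketch. Everything in it is an assumption made for illustration, not the paper's method: the function names, the two-prompt "attack space", the `refuse_prob` parameter, the rule-based evaluator, and the zero-sum reward are all hypothetical stand-ins for learned models and the actual reward design.

```python
import random

# Toy sketch of one self-play round. All names and reward rules here are
# hypothetical; in a real system each role would be a trained model.

def attacker(rng):
    # Hypothetical attacker: proposes either a benign or a harmful prompt.
    return rng.choice(["benign question", "harmful request"])

def defender(prompt, refuse_prob, rng):
    # Hypothetical defender: refuses a harmful prompt with probability
    # refuse_prob, and answers everything else.
    if prompt == "harmful request" and rng.random() < refuse_prob:
        return "refusal"
    return "answer"

def evaluator(prompt, response):
    # Hypothetical evaluator: the exchange is unsafe only when a harmful
    # request gets answered. This judgment stands in for manual labels.
    return not (prompt == "harmful request" and response == "answer")

def self_play_round(refuse_prob, rng):
    prompt = attacker(rng)
    response = defender(prompt, refuse_prob, rng)
    safe = evaluator(prompt, response)
    # Assumed zero-sum reward: the defender gains when the exchange is safe,
    # the attacker gains when it is not.
    defender_reward = 1.0 if safe else -1.0
    return -defender_reward, defender_reward

rng = random.Random(0)
rounds = [self_play_round(refuse_prob=0.5, rng=rng) for _ in range(100)]
print(sum(d for _, d in rounds) / len(rounds))  # mean defender reward
```

The point of the sketch is only the data flow: the attacker produces inputs, the defender responds, and the evaluator's verdict supplies the training signal for both, so no human has to label individual exchanges.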
This paper trains a vision-language model to reason about images through dialogue with copies of itself, with all planning and decision-making carried out in text.