This paper finds that popular RLVR (reinforcement learning with verifiable rewards) methods for training language and vision-language models carry a hidden bias toward certain answer lengths, which can hurt learning.
This paper shows that comics (multi-panel pictures with words) can help AI think through problems step by step, the way a student explains their work.