This paper teaches AI to pay attention better by training its focus, not just its words.
The paper studies how to train a smaller language model from a larger one by focusing only on the most useful parts rather than on everything.