Token Sparse Attention: Efficient Long-Context Inference with Interleaved Token Selection
Intermediate · Dongwon Jo, Beomseok Kang et al. · Feb 3 · arXiv
This paper speeds up how AI models read very long texts by carefully choosing which words (tokens) to focus on at each step.
#Token Sparse Attention #Dynamic Token Coverage #Representation Drift