HySparse: A Hybrid Sparse Attention Architecture with Oracle Token Selection and KV Cache Sharing
Intermediate · Yizhao Gao, Jianyu Wei et al. · Feb 3 · arXiv
HySparse is a hybrid attention architecture for AI models that interleaves a few full attention layers with many faster, memory-saving sparse attention layers.
#Hybrid Sparse Attention #Oracle Token Selection #KV Cache Sharing