MiniCPM-SALA: Hybridizing Sparse and Linear Attention for Efficient Long-Context Modeling
MiniCPM Team, Wenhao An et al. · Feb 12 · arXiv
MiniCPM-SALA is a 9B-parameter language model that mixes two kinds of attention—sparse and linear—to read very long texts quickly and accurately.
#long-context-modeling #sparse-attention #linear-attention
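As a rough illustration of the hybrid idea (not the paper's actual architecture), a stack might interleave linear-attention layers, whose cost grows linearly with sequence length, with sparse layers that attend only to a local window. The layer ratio, window size, and feature map below are illustrative assumptions; real models would also include learned projections, heads, and normalization.

```python
import numpy as np

def linear_attention(q, k, v, eps=1e-6):
    # Kernelized (non-causal, for brevity) attention: computing
    # phi(q) @ (phi(k)^T v) costs O(n * d^2) instead of O(n^2 * d).
    # The elu(x)+1 feature map is one common positive-valued choice.
    phi = lambda x: np.where(x > 0, x + 1.0, np.exp(x))
    qf, kf = phi(q), phi(k)
    kv = kf.T @ v                       # (d, d_v) summary of all keys/values
    z = qf @ kf.sum(axis=0)             # per-query normalizer
    return (qf @ kv) / (z[:, None] + eps)

def sliding_window_attention(q, k, v, window=4):
    # Sparse attention: each query attends only to its last `window` keys,
    # so total cost is O(n * window * d) rather than O(n^2 * d).
    n, d = q.shape
    out = np.zeros_like(v)
    for i in range(n):
        lo = max(0, i - window + 1)
        scores = q[i] @ k[lo:i + 1].T / np.sqrt(d)
        w = np.exp(scores - scores.max())
        w /= w.sum()
        out[i] = w @ v[lo:i + 1]
    return out

def hybrid_stack(x, n_layers=4):
    # Alternate the two layer types; the 1:1 mix is an assumption here.
    for layer in range(n_layers):
        attn = linear_attention if layer % 2 == 0 else sliding_window_attention
        x = x + attn(x, x, x)           # residual connection; weights omitted
    return x
```

Either layer alone trades quality for speed in a different way; the appeal of a hybrid is that sparse layers keep precise local detail while linear layers carry a cheap global summary across the whole context.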