Papers (2)

#YaRN

RePo: Language Models with Context Re-Positioning

Huayang Li, Tianyu Zhao et al. · Dec 16 · arXiv

Large language models usually place tokens in fixed, linear position slots, which can waste the model's limited capacity and make it harder to find the important parts of a long or noisy text.

#context re-positioning · #positional encoding · #self-attention

Not triaged yet

Beyond Real: Imaginary Extension of Rotary Position Embeddings for Long-Context LLMs

Intermediate

Xiaoran Liu, Yuerong Song et al. · Dec 8 · arXiv

Big language models use RoPE to keep track of word order, but standard attention keeps only the real part of the complex-valued query-key product and throws away the imaginary part.

#RoPE++ · #Rotary Position Embeddings · #Imaginary Attention

Not triaged yet
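
To make the "imaginary half" in the summary above concrete, here is a minimal sketch of RoPE written with complex numbers. It assumes the usual textbook formulation (each feature pair is one complex number, rotated by an angle proportional to the token position). The function names `rope_rotate` and `complex_score` and the toy vectors are hypothetical, and this is not the RoPE++ method from the paper, only an illustration of which component standard attention drops.

```python
import cmath

def rope_rotate(pairs, position, base=10000.0):
    """Rotate each complex feature pair by an angle proportional to position.

    Pair i uses frequency base ** (-i / n), the usual RoPE schedule
    when the head dimension is 2 * n real values.
    """
    n = len(pairs)
    return [
        z * cmath.exp(1j * position * base ** (-i / n))
        for i, z in enumerate(pairs)
    ]

def complex_score(q, k, q_pos, k_pos):
    """Complex-valued query-key product after rotation (before any softmax)."""
    q_rot = rope_rotate(q, q_pos)
    k_rot = rope_rotate(k, k_pos)
    return sum(a * b.conjugate() for a, b in zip(q_rot, k_rot))

# Toy query and key, two complex pairs each (illustrative values only).
q = [1 + 2j, 0.5 - 1j]
k = [0.3 + 0.1j, -1 + 0.7j]

score = complex_score(q, k, q_pos=7, k_pos=3)
standard_logit = score.real  # the part standard RoPE attention uses
discarded = score.imag       # the part the summary says is thrown away
```

The score depends only on the offset `q_pos - k_pos`, so shifting both positions by the same amount leaves both the real and the imaginary part unchanged.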