How I Study AI - Learn AI Papers & Lectures the Easy Way

Papers (2)

#Rotary Position Embeddings

Improving Recursive Transformers with Mixture of LoRAs

Intermediate
Mohammadmahdi Nouriborji, Morteza Rohanian et al. · Dec 14 · arXiv

Recursive transformers save memory by reusing the same layer over and over, but that makes them less expressive and hurts accuracy; mixing in lightweight LoRA adapters is meant to win that expressiveness back (see the sketch after the tags below).

#Mixture of LoRAs · #recursive transformers · #parameter sharing
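To make the idea concrete, here is a minimal PyTorch sketch of a recursive block: one transformer layer is reused across depth steps, and a small set of LoRA adapters, mixed by a learned per-token router, adds back some per-step capacity. The class names, the soft routing, and every hyperparameter are illustrative assumptions, not the paper's exact architecture.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class LoRA(nn.Module):
    """Low-rank adapter: adds (alpha/rank) * x @ A^T @ B^T to a frozen layer's output."""

    def __init__(self, d_model, rank=8, alpha=16):
        super().__init__()
        self.A = nn.Parameter(torch.randn(rank, d_model) * 0.01)
        self.B = nn.Parameter(torch.zeros(d_model, rank))
        self.scale = alpha / rank

    def forward(self, x):
        # x: (batch, seq, d_model) -> low-rank update of the same shape
        return (x @ self.A.T @ self.B.T) * self.scale


class RecursiveBlockWithMoL(nn.Module):
    """One shared transformer layer applied `depth` times.

    Parameter sharing keeps memory low; a small mixture of LoRAs,
    weighted by a learned router, differentiates the recursion steps.
    (Illustrative sketch only, not the paper's architecture.)
    """

    def __init__(self, d_model=256, n_heads=4, depth=6, n_loras=4):
        super().__init__()
        self.shared = nn.TransformerEncoderLayer(
            d_model, n_heads, dim_feedforward=4 * d_model, batch_first=True
        )
        self.loras = nn.ModuleList([LoRA(d_model) for _ in range(n_loras)])
        self.router = nn.Linear(d_model, n_loras)  # soft mixture weights
        self.depth = depth

    def forward(self, x):
        for _ in range(self.depth):  # reuse the same weights at every step
            h = self.shared(x)
            gates = F.softmax(self.router(h), dim=-1)              # (B, T, n_loras)
            delta = torch.stack([l(h) for l in self.loras], -1)    # (B, T, D, n_loras)
            x = h + (delta * gates.unsqueeze(-2)).sum(-1)          # mix the adapters
        return x


if __name__ == "__main__":
    model = RecursiveBlockWithMoL()
    out = model(torch.randn(2, 16, 256))
    print(out.shape)  # torch.Size([2, 16, 256])
```

The point of the sketch: the shared layer keeps the parameter count nearly flat as depth grows, while the LoRA mixture adds only a handful of low-rank matrices.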

Beyond Real: Imaginary Extension of Rotary Position Embeddings for Long-Context LLMs

Intermediate
Xiaoran Liu, Yuerong Song et al. · Dec 8 · arXiv

Big language models use RoPE to keep track of word order, but standard RoPE attention only keeps the real half of a complex-number product and throws the imaginary half away (see the sketch after the tags below).

#RoPE++ · #Rotary Position Embeddings · #Imaginary Attention
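The "discarded imaginary half" can be seen directly in a small PyTorch sketch: pair channels into complex numbers, rotate them by position (standard RoPE), and look at the complex query-key product. Vanilla attention uses only its real part; the imaginary part, which RoPE++ aims to exploit, is simply dropped. The toy shapes and random query/key below are illustrative, not the paper's implementation.

```python
import torch


def rope_complex(x, base=10000.0):
    """View channel pairs as complex numbers and rotate them by
    position-dependent angles (standard RoPE)."""
    B, T, D = x.shape
    half = D // 2
    freqs = torch.pow(base, -torch.arange(half, dtype=torch.float32) / half)   # (half,)
    angles = torch.arange(T, dtype=torch.float32)[:, None] * freqs[None, :]    # (T, half)
    rot = torch.polar(torch.ones_like(angles), angles)                         # e^{i*theta}
    xc = torch.view_as_complex(x.float().reshape(B, T, half, 2))               # (B, T, half)
    return xc * rot                                                            # rotated features


# Toy query/key for a single attention head (random, for illustration only).
B, T, D = 1, 8, 16
q = rope_complex(torch.randn(B, T, D))
k = rope_complex(torch.randn(B, T, D))

# Complex inner product between every query position and key position.
scores = torch.einsum("bqd,bkd->bqk", q, k.conj())

# The real part equals the usual RoPE attention score; the imaginary part
# also depends only on relative position, but vanilla RoPE attention drops it.
real_scores = scores.real   # what standard attention uses
imag_scores = scores.imag   # the half that RoPE++ sets out to use
print(real_scores.shape, imag_scores.shape)  # torch.Size([1, 8, 8]) twice
```

Both halves of `scores` vary only with the distance between query and key positions, which is why the imaginary part is a natural extra signal for long-context attention.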