🎓 How I Study AI (HISA)

Papers (4)


Memory Caching: RNNs with Growing Memory

Beginner
Ali Behrouz, Zeman Li et al. · Feb 27 · arXiv

Recurrent neural networks (RNNs) are fast but forgetful because they squeeze everything they’ve seen into a tiny, fixed memory.

#Memory Caching · #Recurrent Neural Networks · #Attention

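A minimal sketch of the problem this paper starts from (not the paper's method; all names and sizes here are illustrative): a vanilla RNN compresses everything it reads into one fixed-size hidden state, so the memory never grows with the input.

```python
import numpy as np

rng = np.random.default_rng(0)
d_in, d_h = 8, 16                        # input and hidden sizes (arbitrary)
W_x = rng.normal(size=(d_h, d_in)) * 0.1
W_h = rng.normal(size=(d_h, d_h)) * 0.1

def rnn_encode(sequence):
    h = np.zeros(d_h)                    # fixed-size memory
    for x in sequence:
        h = np.tanh(W_x @ x + W_h @ h)   # every token is squeezed into h
    return h

short = rng.normal(size=(5, d_in))
long = rng.normal(size=(5000, d_in))
# The memory is the same size no matter how much was read:
assert rnn_encode(short).shape == rnn_encode(long).shape == (d_h,)
```

This fixed budget is why long-past tokens get overwritten; the paper's "growing memory" is aimed at exactly this bottleneck.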

Hybrid Linear Attention Done Right: Efficient Distillation and Effective Architectures for Extremely Long Contexts

Intermediate
Yingfa Chen, Zhen Leng Thai et al. · Jan 29 · arXiv

This paper shows how to distill a large Transformer into a faster hybrid model that mixes attention and RNN layers, using far less training data than training from scratch (about 2.3B tokens).

#hybrid attention · #RNN attention hybrid · #linear attention

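A toy sketch of what "hybrid" means here (the layer counts and spacing are hypothetical, not the paper's recipe): most layers use a recurrent/linear-attention mixer with constant-size state, while a few full softmax-attention layers are kept for precise long-range recall.

```python
def make_layer_plan(n_layers=24, attn_every=6):
    """Interleave a few full-attention layers into a mostly linear stack."""
    return ["attention" if i % attn_every == 0 else "linear"
            for i in range(n_layers)]

plan = make_layer_plan()
# 4 expensive attention layers, 20 cheap linear/recurrent layers:
assert plan.count("attention") == 4
assert plan.count("linear") == 20
```

Distillation then only has to teach this cheaper stack to imitate the original all-attention model, which is why so few tokens suffice.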

Fast-weight Product Key Memory

Intermediate
Tianyu Zhao, Llion Jones · Jan 2 · arXiv

The paper introduces Fast-weight Product Key Memory (FwPKM), a memory layer that can quickly learn from the current text it reads, not just from past training.

#Fast-weight memory · #Product Key Memory · #Sparse retrieval

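An illustrative sketch of the product-key lookup that FwPKM builds on (the names and sizes below are hypothetical, and the fast-weight update itself is omitted): two small sub-key tables index a large value table via their Cartesian product, so N² memory slots need only 2N keys.

```python
import numpy as np

rng = np.random.default_rng(0)
n, d = 16, 8                       # 16 sub-keys per table -> 16*16 = 256 slots
keys_a = rng.normal(size=(n, d // 2))
keys_b = rng.normal(size=(n, d // 2))
values = rng.normal(size=(n * n, d))

def pkm_lookup(query):
    qa, qb = query[: d // 2], query[d // 2 :]
    ia = int(np.argmax(keys_a @ qa))   # best match in the first half-space
    ib = int(np.argmax(keys_b @ qb))   # best match in the second half-space
    return values[ia * n + ib]         # product index into the 256 slots

out = pkm_lookup(rng.normal(size=d))
assert out.shape == (d,)
```

The "fast-weight" part of the paper adds on-the-fly writes to `values` while reading the current text, rather than freezing it after training.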

End-to-End Test-Time Training for Long Context

Intermediate
Arnuv Tandon, Karan Dalal et al. · Dec 29 · arXiv

This paper shows how a language model can keep learning while you use it, so it handles very long inputs without slowing down.

#Test-Time Training · #Meta-learning · #Long-context language modeling

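A hedged sketch of the test-time-training idea in general (not this paper's exact method; all names are illustrative): a small inner model keeps taking gradient steps on the text it is currently reading, so its weights act as a memory that grows with the input instead of a fixed-size state.

```python
import numpy as np

rng = np.random.default_rng(0)
d = 8
W = np.zeros((d, d))                   # fast weights, updated at test time
lr = 0.1

def read_token(x_prev, x_next):
    """One test-time step: train W to predict the next token's features."""
    global W
    err = W @ x_prev - x_next          # prediction error on the live text
    W -= lr * np.outer(err, x_prev)    # gradient step on 0.5 * ||err||^2
    return float(err @ err)

xs = rng.normal(size=(50, d))
losses = [read_token(xs[i], xs[i + 1]) for i in range(len(xs) - 1)]
assert not np.allclose(W, 0)           # the model changed while "reading"
```

Because each step is a constant-cost weight update, the per-token cost stays flat no matter how long the context gets, which is the "without slowing down" part of the summary.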