Papers2

#Fast Weights

Test-Time Training with KV Binding Is Secretly Linear Attention

Junchen Liu, Sven Elflein et al.Feb 24arXiv

The paper shows that Test-Time Training (TTT) with key–value (KV) binding is not really memorizing like a notebook; it is acting like a learned linear attention layer.

#Test-Time Training#KV Binding#Linear Attention

Not triaged yet

Nested Learning: The Illusion of Deep Learning Architectures

Intermediate

Ali Behrouz, Meisam Razaviyayn et al.Dec 31arXiv

The paper introduces Nested Learning, a new way to build AI that learns in layers (like Russian dolls), so each part can update at its own speed and remember different things.

#Nested Learning#Associative Memory#In-Context Learning

Not triaged yet