Recurrent neural networks (RNNs) are fast but forgetful: they compress everything they've seen into a small, fixed-size memory.
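A minimal sketch of that fixed-memory bottleneck (all names and sizes here are illustrative, not from any particular paper): no matter how long the input sequence is, a vanilla RNN folds it into one hidden state of constant size.

```python
import numpy as np

rng = np.random.default_rng(0)
d_in, d_hidden = 4, 8
W_x = rng.normal(size=(d_hidden, d_in)) * 0.1     # input-to-hidden weights
W_h = rng.normal(size=(d_hidden, d_hidden)) * 0.1  # hidden-to-hidden weights

def rnn_encode(xs):
    """Fold a sequence of input vectors into a single d_hidden-sized state."""
    h = np.zeros(d_hidden)
    for x in xs:
        h = np.tanh(W_x @ x + W_h @ h)  # the state never grows with length
    return h

short_history = rng.normal(size=(5, d_in))
long_history = rng.normal(size=(5000, d_in))

# Both histories end up squeezed into the same fixed-size memory:
assert rnn_encode(short_history).shape == (d_hidden,)
assert rnn_encode(long_history).shape == (d_hidden,)
```

That constant-size state is why RNNs stay fast on long inputs, and also why they forget: every new token overwrites part of the same small memory.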
The paper shows that Test-Time Training (TTT) with key–value (KV) binding is not memorizing entries like a notebook; it is equivalent to a learned linear attention layer.
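A hedged sketch of the equivalence being claimed, under simplifying assumptions (matrix memory `S`, learning rate 1, and an inner-product binding loss; the names are illustrative, not the paper's notation): one SGD step per token on a KV-binding loss updates the memory exactly the way linear attention accumulates outer products.

```python
import numpy as np

rng = np.random.default_rng(1)
d = 6
keys = rng.normal(size=(10, d))
vals = rng.normal(size=(10, d))

# Linear attention view: the state accumulates outer products v k^T.
S_linattn = np.zeros((d, d))
for k, v in zip(keys, vals):
    S_linattn += np.outer(v, k)

# TTT view: start from W = 0 and take one SGD step per token on the
# binding loss L(W) = -v @ (W @ k), whose gradient is -outer(v, k).
S_ttt = np.zeros((d, d))
lr = 1.0
for k, v in zip(keys, vals):
    grad = -np.outer(v, k)
    S_ttt -= lr * grad  # SGD step: W <- W - lr * dL/dW

# The two update rules produce the same memory state...
assert np.allclose(S_linattn, S_ttt)

# ...and reading it out with a query is the same in both views:
q = rng.normal(size=d)
out = S_ttt @ q
```

Under these assumptions, "TTT with KV binding" and "linear attention" are two descriptions of the same state-update rule, which is the sense in which the memory is learned rather than a verbatim notebook.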
The paper proposes HyTRec, a recommender system that reads very long user histories quickly while still paying sharp attention to the latest clicks and purchases.
SLA2 speeds up attention by splitting the work between two helpers: a precise one (sparse attention) and a fast one (linear attention).
Transformers are powerful but slow because regular self-attention compares every token with every other token, a cost that grows quadratically with sequence length.
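The quadratic cost is easy to see in code. A minimal sketch of standard softmax self-attention (sequence length and head dimension here are arbitrary): the score matrix holds one entry per token pair, so its size is n × n.

```python
import numpy as np

rng = np.random.default_rng(2)
n, d = 128, 16  # sequence length, head dimension (illustrative values)
Q = rng.normal(size=(n, d))
K = rng.normal(size=(n, d))
V = rng.normal(size=(n, d))

# Every token is compared with every other token: n^2 dot products.
scores = Q @ K.T / np.sqrt(d)               # shape (n, n)

# Row-wise softmax (shifted by the row max for numerical stability).
weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
weights /= weights.sum(axis=-1, keepdims=True)

out = weights @ V                            # shape (n, d)

assert scores.shape == (n, n)  # doubling n quadruples this matrix
assert out.shape == (n, d)
```

Doubling the sequence length quadruples both the memory for `scores` and the work to fill it, which is exactly the bottleneck that sparse, linear, and hybrid attention schemes try to avoid.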