Papers (2)

#LongBench

LycheeDecode: Accelerating Long-Context LLM Inference via Hybrid-Head Sparse Decoding

Long texts make language models slow because they must keep and re-check a huge memory called the KV cache for every new word they write.

#long-context LLM #sparse attention #head specialization

Recursive Language Models

Beginner

Alex L. Zhang, Tim Kraska et al. · Dec 31 · arXiv

Recursive Language Models (RLMs) let an AI read and work with prompts that are much longer than its normal memory by treating the prompt like a big external document it can open, search, and study with code.

#Recursive Language Models #RLM #Long-context reasoning
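
To make the idea of treating a prompt as an external document more concrete, here is a minimal sketch. It is not the authors' method: it assumes a fixed divide-and-conquer strategy in place of model-written code, and `toy_llm`, `recursive_answer`, and the 200-character `window` are hypothetical names and values chosen for illustration. The stand-in "model" only does a keyword search, so the example shows the control flow and nothing about real model quality.

```python
from typing import Callable, List

seen_lengths: List[int] = []  # records how much text each "model call" received


def toy_llm(context: str, question: str) -> str:
    """Hypothetical stand-in for a model call: returns the lines of
    `context` that mention `question` (case-insensitive)."""
    seen_lengths.append(len(context))
    hits = [ln for ln in context.splitlines() if question.lower() in ln.lower()]
    return "\n".join(hits)


def recursive_answer(document: str, question: str,
                     llm: Callable[[str, str], str], window: int = 200) -> str:
    """Answer `question` over `document` without ever passing more than
    `window` characters to `llm` in a single call."""
    if len(document) <= window:
        return llm(document, question)  # small enough to read directly

    lines = document.splitlines()
    if len(lines) > 1:
        # Split on a line boundary so no line is cut in half.
        mid = len(lines) // 2
        halves = ["\n".join(lines[:mid]), "\n".join(lines[mid:])]
    else:
        # A single oversized line: fall back to a character split.
        mid = len(document) // 2
        halves = [document[:mid], document[mid:]]

    # Recurse on each half and keep only the non-empty partial answers.
    partials = [recursive_answer(h, question, llm, window) for h in halves]
    return "\n".join(p for p in partials if p)


# Usage: a 1000-line document with one relevant line buried inside it.
doc_lines = [f"line {i}: nothing to see here" for i in range(1000)]
doc_lines[637] = "the secret code is 4217"
doc = "\n".join(doc_lines)

answer = recursive_answer(doc, "secret code", toy_llm, window=200)
print(answer)
```

The full document is far longer than the 200-character window, yet no single call to `toy_llm` sees more than the window allows. In this sketch the splitting rule is hard-coded; the summary above describes the model itself deciding how to open and search the prompt with code.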