Papers (2)

#compute efficiency

Proxy Compression for Language Modeling

Most language models are trained on tokenized text, a compressed representation of the raw bytes, which makes training fast but ties the model to a specific tokenizer.

#proxy compression #byte-level language modeling #tokenizer-free inference

An Information Theoretic Perspective on Agentic System Design

Intermediate

Shizhe He, Avanika Narayan et al., Dec 25, arXiv

The paper shows that many AI systems work best when a small 'compressor' model first shrinks long text into a short, info-packed summary and a bigger 'predictor' model then reasons over that summary.

#agentic systems #compressor-predictor #mutual information