Papers (5)

#representation learning

Next Embedding Prediction Makes World Models Stronger

George Bredis, Nikita Balagansky et al. · Mar 3 · arXiv

NE-Dreamer is a model-based reinforcement learning agent that skips rebuilding pixels and instead learns by predicting the next step’s hidden features.

#model-based reinforcement learning · #world models · #next-embedding prediction

InfoNCE Induces Gaussian Distribution

Intermediate

Roy Betser, Eyal Gofer et al. · Feb 27 · arXiv

The paper shows that training with the popular InfoNCE contrastive loss drives the learned features toward a Gaussian (bell-shaped) distribution.

#InfoNCE · #contrastive learning · #Gaussian embeddings

KV-Embedding: Training-free Text Embedding via Internal KV Re-routing in Decoder-only LLMs

Intermediate

Yixuan Tang, Yi Yang · Jan 3 · arXiv

This paper shows how to get strong text embeddings from decoder-only language models without any training.

#text embeddings · #decoder-only LLMs · #causal attention

Next-Embedding Prediction Makes Strong Vision Learners

Beginner

Sihan Xu, Ziqiao Ma et al. · Dec 18 · arXiv

This paper introduces NEPA, a very simple way to teach vision models by having them predict the next patch’s embedding in an image sequence, just like language models predict the next word.

#self-supervised learning · #vision transformer · #autoregression

SVG-T2I: Scaling Up Text-to-Image Latent Diffusion Model Without Variational Autoencoder

Intermediate

Minglei Shi, Haolin Wang et al. · Dec 12 · arXiv

This paper shows you can train a big text-to-image diffusion model directly on the features of a vision foundation model (like DINOv3) without using a VAE.

#text-to-image · #diffusion transformer · #flow matching
