Papers2

#self-refinement

Large Multimodal Models as General In-Context Classifiers

Marco Garosi, Matteo Farina et al.Feb 26arXiv

People often pick CLIP-like models for image labeling, but this paper shows that large multimodal models (LMMs) can be just as good—or even better—when you give them a few examples in the prompt (in-context learning).

#in-context learning#multimodal models#open-world classification

Not triaged yet

Self-Refining Video Sampling

Intermediate

Sangwon Jang, Taekyung Ki et al.Jan 26arXiv

This paper shows how a video generator can improve its own videos during sampling, without extra training or outside checkers.

#video generation#flow matching#denoising autoencoder

Not triaged yet