How I Study AI - Learn AI Papers & Lectures the Easy Way

Intermediate

Boqiang Zhang, Lei Ke et al.Mar 6arXiv

Penguin-VL shows that small vision-language models (2B and 8B) can be very strong if you give them a better vision encoder, not just a bigger brain.

#Vision Language Model#LLM-based Vision Encoder#Contrastive Learning

Papers1