DeepSeek-OCR 2 teaches a computer to βreadβ pictures of documents in a smarter order, more like how people read.
Co2S is a new way to train segmentation models with very few labels by letting two different students (CLIP and DINOv3) learn together and correct each other.
JavisGPT is a single AI that can both understand sounding videos (audio + video together) and also create new ones that stay in sync.