πŸŽ“How I Study AIHISA
πŸ“–Read
πŸ“„PapersπŸ“°Blogs🎬Courses
πŸ’‘Learn
πŸ›€οΈPathsπŸ“šTopicsπŸ’‘Concepts🎴Shorts
🎯Practice
🧩Problems🎯Prompts🧠Review
Search
How I Study AI - Learn AI Papers & Lectures the Easy Way

Papers3

AllBeginnerIntermediateAdvanced
All SourcesarXiv
#Image Reconstruction

OpenVision 3: A Family of Unified Visual Encoder for Both Understanding and Generation

Intermediate
Letian Zhang, Sucheng Ren et al.Jan 21arXiv

OpenVision 3 is a single vision encoder that learns one set of image tokens that work well for both understanding images (like answering questions) and generating images (like making new pictures).

#Unified Visual Encoder#VAE#Vision Transformer

Implicit Neural Representation Facilitates Unified Universal Vision Encoding

Intermediate
Matthew Gwilliam, Xiao Wang et al.Jan 20arXiv

This paper introduces HUVR, a single vision model that can both recognize what’s in an image and reconstruct or generate images from tiny codes.

#Implicit Neural Representation#Hyper-Networks#Vision Transformer

VQRAE: Representation Quantization Autoencoders for Multimodal Understanding, Generation and Reconstruction

Intermediate
Sinan Du, Jiahao Guo et al.Nov 28arXiv

VQRAE is a new kind of image tokenizer that lets one model both understand images (continuous features) and generate/reconstruct them (discrete tokens).

#VQRAE#Vector Quantization#Representation Autoencoder