๐ŸŽ“How I Study AIHISA
๐Ÿ“–Read
๐Ÿ“„Papers๐Ÿ“ฐBlogs๐ŸŽฌCourses
๐Ÿ’กLearn
๐Ÿ›ค๏ธPaths๐Ÿ“šTopics๐Ÿ’กConcepts๐ŸŽดShorts
๐ŸŽฏPractice
๐Ÿ“Daily Log๐ŸŽฏPrompts๐Ÿง Review
SearchSettings
How I Study AI - Learn AI Papers & Lectures the Easy Way

Papers2

AllBeginnerIntermediateAdvanced
All SourcesarXiv
#learnable query tokens

Kiwi-Edit: Versatile Video Editing via Instruction and Reference Guidance

Intermediate
Yiqi Lin, Guoqiang Liang et al.Mar 2arXiv

Kiwi-Edit is a new video editor that follows your words and also copies looks from a picture you give it.

#reference-guided video editing#instruction-based editing#multimodal large language model

VINO: A Unified Visual Generator with Interleaved OmniModal Context

Beginner
Junyi Chen, Tong He et al.Jan 5arXiv

VINO is a single AI model that can make and edit both images and videos by listening to text and looking at reference pictures and clips at the same time.

#VINO#unified visual generator#multimodal diffusion transformer