How I Study AI - Learn AI Papers & Lectures the Easy Way

Papers (7)

#multi-view consistency

OmniTransfer: All-in-one Framework for Spatio-temporal Video Transfer

Intermediate
Pengze Zhang, Yanze Wu et al. · Jan 20 · arXiv

OmniTransfer is a single system that learns from a whole reference video, not just one image, so it can copy how things look (identity and style) and how they move (motion, camera, effects).

#spatio-temporal video transfer, #identity transfer, #style transfer

3AM: 3egment Anything with Geometric Consistency in Videos

Intermediate
Yang-Che Sun, Cheng Sun et al. · Jan 13 · arXiv

3AM is a new way to track and segment the same object across a whole video, even when the camera view changes a lot.

#video object segmentation, #SAM2, #geometry-aware tracking

Plenoptic Video Generation

Intermediate
Xiao Fu, Shitao Tang et al. · Jan 8 · arXiv

PlenopticDreamer is a new way to remake a video from different camera paths while keeping everything consistent across views and over time.

#plenoptic function, #camera-controlled video generation, #video re-rendering

How Much 3D Do Video Foundation Models Encode?

Intermediate
Zixuan Huang, Xiang Li et al. · Dec 23 · arXiv

This paper asks a simple question: do video AI models trained only on 2D videos secretly learn about 3D worlds?

#video foundation models, #3D awareness, #temporal reasoning

MatSpray: Fusing 2D Material World Knowledge on 3D Geometry

Intermediate
Philipp Langsteiner, Jan-Niklas Dihlmann et al. · Dec 20 · arXiv

MatSpray turns 2D guesses about what materials look like (color, shininess, metal) into a clean 3D model you can relight realistically.

#MatSpray, #3D Gaussian Splatting, #Gaussian Ray Tracing

InsertAnywhere: Bridging 4D Scene Geometry and Diffusion Models for Realistic Video Object Insertion

Intermediate
Hoiyeong Jin, Hyojin Jang et al. · Dec 19 · arXiv

InsertAnywhere is a two-stage system that lets you add a new object into any video so it looks like it was always there.

#video object insertion, #4D scene geometry, #diffusion video generation

Towards Interactive Intelligence for Digital Humans

Intermediate
Yiyi Cai, Xuangeng Chu et al. · Dec 15 · arXiv

Digital humans used to just copy motions; this paper aims to make them think, speak, and move in sync like real people.

#interactive intelligence, #digital human, #multimodal avatar