Papers (7)

#multi-view consistency

OmniTransfer: All-in-one Framework for Spatio-temporal Video Transfer

Pengze Zhang, Yanze Wu et al. · Jan 20 · arXiv

OmniTransfer is a single system that learns from a whole reference video, not just one image, so it can copy how things look (identity and style) and how they move (motion, camera, effects).

#spatio-temporal video transfer · #identity transfer · #style transfer

3AM: 3egment Anything with Geometric Consistency in Videos

Intermediate

Yang-Che Sun, Cheng Sun et al. · Jan 13 · arXiv

3AM is a new way to track and segment the same object across a whole video, even when the camera view changes a lot.

#video object segmentation · #SAM2 · #geometry-aware tracking

Plenoptic Video Generation

Intermediate

Xiao Fu, Shitao Tang et al. · Jan 8 · arXiv

PlenopticDreamer is a new way to remake a video from different camera paths while keeping everything consistent across views and over time.

#plenoptic function · #camera-controlled video generation · #video re-rendering

How Much 3D Do Video Foundation Models Encode?

Intermediate

Zixuan Huang, Xiang Li et al. · Dec 23 · arXiv

This paper asks a simple question: do video AI models trained only on 2D videos implicitly learn the 3D structure of the world?

#video foundation models · #3D awareness · #temporal reasoning

MatSpray: Fusing 2D Material World Knowledge on 3D Geometry

Intermediate

Philipp Langsteiner, Jan-Niklas Dihlmann et al. · Dec 20 · arXiv

MatSpray turns 2D guesses about what materials look like (color, shininess, metal) into a clean 3D model you can relight realistically.

#MatSpray · #3D Gaussian Splatting · #Gaussian Ray Tracing

InsertAnywhere: Bridging 4D Scene Geometry and Diffusion Models for Realistic Video Object Insertion

Intermediate

Hoiyeong Jin, Hyojin Jang et al. · Dec 19 · arXiv

InsertAnywhere is a two-stage system that lets you add a new object into any video so it looks like it was always there.

#video object insertion · #4D scene geometry · #diffusion video generation

Towards Interactive Intelligence for Digital Humans

Intermediate

Yiyi Cai, Xuangeng Chu et al. · Dec 15 · arXiv

Digital humans used to just copy motions; this paper makes them think, speak, and move in sync like real people.

#interactive intelligence · #digital human · #multimodal avatar
