Papers (3)

#image-to-video

NOVA: Sparse Control, Dense Synthesis for Pair-Free Video Editing

NOVA is a new video editor that lets you edit a few keyframes (sparse control) while it preserves the original motion and background details (dense synthesis).

#video editing #pair-free training #sparse control

Not triaged yet

FSVideo: Fast Speed Video Diffusion Model in a Highly-Compressed Latent Space

Intermediate

FSVideo Team, Qingyu Chen et al. · Feb 2 · arXiv

FSVideo is a new image-to-video generator that runs about 42× faster than popular open-source models while keeping similar visual quality.

#FSVideo #image-to-video #video diffusion transformer

Not triaged yet

Apollo: Unified Multi-Task Audio-Video Joint Generation

Intermediate

Jun Wang, Chunyu Qiang et al. · Jan 7 · arXiv

Apollo is a single, unified model that can generate video and audio together or separately while keeping them tightly in sync.

#audio-video generation #multimodal diffusion #single-tower transformer

Not triaged yet