JavisDiT++: Unified Modeling and Optimization for Joint Audio-Video Generation
Intermediate · Kai Liu, Yanhao Zheng et al. · Feb 22 · arXiv
JavisDiT++ is a new AI model that generates short videos and matching audio from a text prompt, keeping sight and sound in sync.
#joint audio-video generation #multimodal diffusion transformer #modality-specific mixture-of-experts