JavisGPT is a single AI that can both understand sounding videos (audio + video together) and create new ones that stay in sync.
UltraShape 1.0 is a two-step 3D generator that first makes a simple overall shape and then zooms in to add tiny details.
SAM Audio is a new AI that can pull out exactly the sound you want from a noisy mix using text, clicks on a video, and time ranges—together or separately.
ReCo is a new way to edit videos just by describing the change you want in words, with no extra masks needed.
FlashPortrait makes talking-portrait videos that keep a person’s identity steady for as long as you want—minutes or even hours.
Robots learn best from what they would actually see, which is a first-person (egocentric) view, but most AI models are trained on third-person videos and get confused.
Kling-Omni is a single, unified model that can understand text, images, and videos together and then make or edit high-quality videos from those mixed instructions.
Spatia is a video generator that keeps a live 3D map of the scene (a point cloud) as its memory while making videos.
This paper fixes a common problem in video-making AIs where tiny mistakes snowball over time and ruin long videos.
IC-Effect is a new way to add special effects to existing videos by following a text instruction while keeping everything else unchanged.
Seedance 1.5 pro is a single model that makes video and sound together at the same time, so lips, music, and actions match naturally.
KlingAvatar 2.0 is a system that makes long, sharp, lifelike talking-person videos that follow audio, images, and text instructions all at once.