LIVE is a new way to train video-making AIs so their mistakes don’t snowball over long videos.
The paper fixes a hidden mistake many fast video generators were making when turning a "see-everything" (bidirectional) model into a "see-past-only" (causal) model.
LingBot-World is an open-source world model that turns video generation into an interactive, real-time simulator.
OmniTransfer is a single system that learns from a whole reference video, not just one image, so it can copy how things look (identity and style) and how they move (motion, camera, effects).
Putting the reading passage before the question and answer choices (CQO) makes language models about 15 percentage points more accurate on average than putting it after them (QOC).
VideoAR is a new way to make videos with AI that writes each frame like a story, one step at a time, while painting details from coarse to fine.
InfiniteVGGT is a streaming 3D vision system that can keep working indefinitely on live video without running out of memory.
This paper shows how to get strong text embeddings from decoder-only language models without any training.
IC-Effect is a new way to add special effects to existing videos by following a text instruction while keeping everything else unchanged.
Autoregressive (AR) models normally write one token at a time, which is accurate but slow for long answers.
ReFusion is a new way for AI to write text faster by planning in chunks (called slots) and then filling each chunk carefully.
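To make the cost of one-token-at-a-time autoregressive writing concrete, here is a minimal Python sketch. It is only an illustration: `toy_next_token` is a hypothetical stand-in for a real model's forward pass, and nothing here is taken from ReFusion or any other system named above. The point is simply that producing N new tokens takes N sequential model calls, because each call depends on the previous output.

```python
from typing import Callable, List, Tuple


def toy_next_token(prefix: List[int]) -> int:
    """Hypothetical stand-in for a model forward pass (assumption, not a real model).

    It deterministically derives the next token from the whole prefix, which is
    enough to show that each step needs the previous step's output.
    """
    return (sum(prefix) + len(prefix)) % 10


def decode_autoregressively(
    prompt: List[int],
    num_new_tokens: int,
    next_token: Callable[[List[int]], int] = toy_next_token,
) -> Tuple[List[int], int]:
    """Generate tokens one at a time and count how many model calls were needed."""
    tokens = list(prompt)
    calls = 0
    for _ in range(num_new_tokens):
        # Each new token is conditioned on everything written so far,
        # so the calls cannot be run in parallel.
        tokens.append(next_token(tokens))
        calls += 1
    return tokens, calls


if __name__ == "__main__":
    tokens, calls = decode_autoregressively([1, 2, 3], num_new_tokens=5)
    print(f"generated {len(tokens) - 3} tokens using {calls} sequential calls")
```

The number of sequential calls grows linearly with answer length, which is the slowdown that chunk-based approaches aim to reduce.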