This paper introduces PCED, a way to use many documents as separate 'experts' in parallel so an AI can stitch answers together without stuffing everything into one giant prompt.
Big reasoning AIs think in many steps, which is slow and costly.
This paper shows how to get strong text embeddings from decoder-only language models without any training.
LiveTalk turns slow, many-step video diffusion into a fast, 4-step, real-time system for talking avatars that listen, think, and respond with synchronized video.
SpotEdit is a training-free way to edit only the parts of an image that actually change, instead of re-generating the whole picture.
HiStream makes 1080p video generation much faster by removing repeated work across space, time, and steps.
Kling-Omni is a single, unified model that can understand text, images, and videos together and then make or edit high-quality videos from those mixed instructions.
This paper speeds up diffusion language models (dLLMs) by changing the order in which they fill in missing words.
MemFlow is a new way for AI to remember the right parts of a long video story while it keeps making new parts, so characters and scenes stay consistent.
VideoSSM is a new way to make long, stable, and lively videos by giving the model two kinds of memory: a short-term window and a long-term state-space memory.
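
To make the two-memory idea in the VideoSSM summary concrete, here is a minimal sketch in Python. It is a generic illustration, not VideoSSM's actual architecture: the class name `TwoPartMemory`, the scalar frame features, and the `decay` parameter are all assumptions made for this example. The short-term memory keeps the last few frames exactly, while the long-term memory folds every frame into one running state using a simple linear recurrence of the kind state-space models build on.

```python
from collections import deque


class TwoPartMemory:
    """Toy memory with a short-term window and a long-term running state.

    Hypothetical illustration only; a real video model would store
    feature tensors rather than single floats.
    """

    def __init__(self, window_size=3, decay=0.9):
        # Short-term memory: the most recent frames, kept exactly.
        self.window = deque(maxlen=window_size)
        # Long-term memory: one compressed summary of everything seen so far.
        self.state = 0.0
        self.decay = decay

    def update(self, frame_feature):
        # Linear recurrence: h_t = a * h_{t-1} + (1 - a) * x_t
        self.state = self.decay * self.state + (1.0 - self.decay) * frame_feature
        # The deque silently drops the oldest frame once it is full.
        self.window.append(frame_feature)

    def context(self):
        # What a generator could condition on when making the next frame.
        return list(self.window), self.state


memory = TwoPartMemory(window_size=2, decay=0.5)
for feature in [1.0, 2.0, 3.0]:
    memory.update(feature)

recent_frames, long_term_state = memory.context()
print(recent_frames, long_term_state)
```

The window forgets old frames entirely once it fills up, while the running state never forgets but only keeps a blurred summary. Combining the two is one plausible way to get both sharp recent detail and long-range consistency.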