CubeComposer is a new AI method that turns a normal forward-facing video into a full 360° VR video at true 4K quality without using super-resolution upscaling.
SkyReels-V4 is a single, unified model that makes videos and matching sounds together, while also letting you fix or change parts of a video.
Video generators are slow because full attention compares every token with every other token, so the cost grows quadratically as videos get longer.
MiniCPM-SALA is a 9B-parameter language model that mixes two kinds of attention (sparse and linear) to read very long texts quickly and accurately.
The paper fixes a common problem in video world models: scenes slowly change or “drift” when the camera moves and comes back.
Long texts make language models slow because they must keep and re-check a huge memory called the KV cache for every new word they write.
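To get a feel for how large the KV cache can become, here is a rough back-of-the-envelope sketch. The model shape below (32 layers, 8 key/value heads, head size 128, 16-bit values) is a hypothetical example chosen for illustration, not a figure from any of the papers summarized here, and `kv_cache_bytes` is an illustrative helper name.

```python
def kv_cache_bytes(num_layers, num_kv_heads, head_dim, seq_len, bytes_per_value=2):
    """Estimate KV cache size for one sequence.

    Each layer stores one key tensor and one value tensor, hence the factor of 2.
    """
    return 2 * num_layers * num_kv_heads * head_dim * seq_len * bytes_per_value


# Hypothetical model shape, assumed only for this example.
size = kv_cache_bytes(num_layers=32, num_kv_heads=8, head_dim=128, seq_len=128_000)
print(f"{size / 2**30:.1f} GiB")
```

The estimate grows linearly with `seq_len`, so doubling the text length doubles the memory that must be kept and re-read for each new token.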
The paper makes long video generation much faster and lighter on memory by cutting out repeated work in attention.
Transformers slow down on very long inputs because standard attention looks at every token pair, which is expensive.
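The cost of looking at every token pair can be made concrete with a small counting sketch. It assumes plain full attention with no masking or sparsity, and `attention_pairs` is an illustrative helper name, not something from the paper.

```python
def attention_pairs(seq_len):
    """Number of query-key scores full attention computes for one head."""
    # Every token (as a query) is compared with every token (as a key).
    return seq_len * seq_len


for n in (1_000, 10_000, 100_000):
    print(f"{n:>7} tokens -> {attention_pairs(n):,} pairs")
```

Making the input 10 times longer makes the number of pairs 100 times larger, which is why very long inputs become expensive so quickly.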
Videos are made of very long lists of tokens, and regular attention looks at every pair of tokens, which is slow and expensive.
This paper fixes a common problem in video-making AIs where tiny mistakes snowball over time and ruin long videos.