CubeComposer is a new AI method that turns a normal forward-facing video into a full 360° VR video at true 4K quality without using super-resolution upscaling.
The paper fixes a common problem in video world models: scenes slowly change or “drift” when the camera moves and comes back.
This paper teaches a vision-language model to first find objects in real 3D space (not just 2D pictures) and then reason about where things are.
The paper turns one flat picture into a neat stack of see‑through layers, so you can edit one thing without messing up the rest.
Large language models usually line words up in fixed order slots, which can waste the model's capacity and make it harder to find the important parts of a long or noisy text.
LitePT is a new AI backbone for 3D point clouds that uses convolutions in early layers and attention in later layers to be both fast and accurate.
GRAPE is a new way to tell Transformers where each word sits in a sentence, using neat math moves called group actions.
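
To make the "group actions" idea concrete, here is a minimal sketch using plane rotations, a familiar group action that rotary-style position encodings are built on. It is an illustration only, not GRAPE's actual formulation; the helper names `rotate` and `dot` and the step size `theta` are assumptions made up for this example. The point it shows: when each position rotates a vector by a proportional angle, the score between two rotated vectors depends only on how far apart the positions are.

```python
import math

def rotate(vec, pos, theta=0.1):
    """Rotate a 2D vector by the angle pos * theta (hypothetical helper)."""
    angle = pos * theta
    c, s = math.cos(angle), math.sin(angle)
    x, y = vec
    return (c * x - s * y, s * x + c * y)

def dot(a, b):
    """Plain 2D dot product, standing in for an attention score."""
    return a[0] * b[0] + a[1] * b[1]

# Toy query and key vectors (arbitrary values, not from any paper).
q, k = (1.0, 2.0), (0.5, -1.0)

# Same offset of 4 positions, at two very different absolute positions.
score_near = dot(rotate(q, 3), rotate(k, 7))
score_far = dot(rotate(q, 103), rotate(k, 107))

# Group property: rotating by 2 steps and then 5 steps equals rotating by 7.
composed = rotate(rotate(q, 2), 5)
direct = rotate(q, 7)
```

Here positions 3 and 7 give the same score as positions 103 and 107 (up to floating-point rounding), since both pairs are 4 steps apart.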