WorldStereo is a method that turns a single photo (or a panorama) into a short set of camera-guided videos and then reconstructs a consistent 3D scene from them.
This paper builds a "generated reality" system that lets AI-made videos react to your real head and hand movements in VR.
This paper introduces Nexus Adapters, tiny helper networks that let a diffusion model follow both a text prompt and a structure map (like edges or depth) at the same time.
Spatia is a video generator that keeps a live 3D map of the scene (a point cloud) as its memory while making videos.
Steer3D lets you change a 3D object just by typing what you want, like “add a roof rack,” and it does it in one quick pass.
LongVie 2 is a video world model that can generate controllable videos for 3–5 minutes while keeping the look and motion steady over time.