WorldStereo is a method that turns a single photo (or a panorama) into a short set of camera-guided videos and then reconstructs a consistent 3D scene from them.
Diffusion models make great images and videos but are slow because they usually need many small denoising steps.
The paper fixes a big problem in long video generation: models either forget what happened or slowly drift off-topic over time.
The paper solves a big problem in fast image generators: they became quick but lost variety, producing very similar pictures over and over.
The paper fixes a hidden mistake many fast video generators were making when turning a "see-everything" model into a "see-past-only" model.
LingBot-World is an open-source world model that turns video generation into an interactive, real-time simulator.
Yume1.5 is a model that turns text or a single image into a living, explorable video world you can move through with keyboard keys.
TwinFlow is a new way to make big image models draw great pictures in just one step instead of 40–100 steps.