ArtHOI is a new zero-shot method that makes people and everyday articulated objects (like doors, drawers, and fridges) move together realistically using only a single generated video as guidance.
The paper shows that when a model compares two of its own answers head-to-head, it picks the right one more often than when it judges each answer alone.
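The contrast between head-to-head comparison and judging each answer alone can be sketched in a toy form. This is not the paper's method: the function names and the stub scorer below are invented purely to illustrate the two evaluation shapes, with a trivial heuristic standing in for a real model call.

```python
def pointwise_judge(answer: str) -> int:
    # Score one answer in isolation on a 0-10 scale.
    # Stub heuristic (word count) stands in for a model's self-rating.
    return min(10, len(answer.split()))

def pairwise_judge(answer_a: str, answer_b: str) -> str:
    # Compare two answers head-to-head and name a winner directly,
    # instead of scoring each one separately and comparing scores later.
    return "A" if pointwise_judge(answer_a) >= pointwise_judge(answer_b) else "B"

# Two candidate answers to the same question:
answers = ["Paris is the capital of France.", "Paris."]
print(pairwise_judge(answers[0], answers[1]))  # prints "A"
```

The structural point is that a pairwise judge sees both candidates in one call, so it only has to rank them, while a pointwise judge must produce a calibrated absolute score with no reference point.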
CubeComposer is a new AI method that turns a normal forward-facing video into a full 360° VR video at true 4K quality without using super-resolution upscaling.
This paper teaches long-horizon AI agents to recall past details exactly without loading their entire memory into context at once.

RIVER Bench is a new test that checks how well AI can watch a video stream and talk with you in real time.
Phi-4-reasoning-vision-15B is a small, open-weight AI that understands pictures and text together and is especially good at math, science, and using computer screens.
SWE-CI is a new benchmark that tests how well AI coding agents can keep a codebase healthy over many changes, not just fix one bug.
This paper shows that teaching AI to first draw a simple map of a text (nodes and links) before answering questions makes it smarter and more reliable.
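The "simple map" idea (entities as nodes, relations as links) can be shown with a toy structure. This sketch is illustrative only, not the paper's representation: the example graph, field names, and helper function are all made up to show how answers can come from traversing links rather than rereading the text.

```python
# A tiny node-and-link "map" of a short text, e.g.
# "Alice wrote a report. Bob reviewed the report."
graph = {
    "nodes": ["Alice", "Bob", "report"],
    "links": [
        ("Alice", "wrote", "report"),
        ("Bob", "reviewed", "report"),
    ],
}

def related(entity: str):
    # Answer questions about an entity by walking its links
    # instead of scanning the original prose again.
    return [(s, r, o) for (s, r, o) in graph["links"] if entity in (s, o)]

print(related("report"))
# prints [('Alice', 'wrote', 'report'), ('Bob', 'reviewed', 'report')]
```

Once the text is in this form, multi-hop questions ("who touched the report?") reduce to simple graph lookups, which is one plausible reason such a step makes a model's answers more reliable.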
Scientists want AI to propose brand-new hypotheses directly from a research background, but training a model to do this end-to-end is mathematically intractable because the search space explodes combinatorially.
InfinityStory is a new system that can make very long videos (even hours) where the world stays the same and characters transition smoothly between shots.
Proact-VL is an AI that talks over live video and knows not only what to say but also when to say it, like a great sports commentator.
Utonia is a single brain (encoder) that learns from many kinds of 3D point clouds, like indoor rooms, outdoor streets, tiny toys, and even city maps.