Diffusion Preview is a two-step “preview-then-refine” workflow that shows you a fast draft image first and spends full compute only after you approve the draft.
The paper shows that judging vector search only by distance-based recall and speed can be very misleading for real tasks.
Scone is a new AI method that makes images from instructions while picking the right subject even when several candidates look similar.
The FACTS Leaderboard is a four-part test that checks how truthful AI models are across images, memory, web search, and document grounding.
SHARP turns a single photo into a 3D scene you can look around in, and it does this in under one second on a single GPU.
This paper introduces the Confucius Code Agent (CCA), a coding helper built to handle huge real-world codebases with long tasks and many tools.
This paper creates MotionEdit, a high-quality dataset that teaches AI to change how people and objects move in a picture without altering their appearance or the scene.
VABench is a new, all-in-one test that checks how well AI makes videos with matching sound and pictures.
Visionary is a web-based platform that lets you view and interact with advanced 3D scenes, right in your browser, with just a click.
Multi-agent AI teams are not automatically better; their success depends on matching the team’s coordination style to the job’s structure.
UnityVideo is a single, unified model that learns from many kinds of video information at once, such as colors (RGB), depth, motion (optical flow), body pose, skeletons, and segmentation, to make smarter, more realistic videos.
OneStory is a new way to make long, multi-shot videos that stay consistent in story, characters, and places over time.