The paper shows that judging vector search only by distance-based recall and speed can be very misleading for real tasks.
Scone is a new AI method that generates images from instructions and picks the right subject even when several candidates look similar.
The FACTS Leaderboard is a four-part test that checks how truthful AI models are across images, memory, web search, and document grounding.
SHARP turns a single photo into a 3D scene you can look around in, and it does this in under one second on a single GPU.
This paper introduces the Confucius Code Agent (CCA), a coding helper built to handle huge real-world codebases with long tasks and many tools.
This paper creates MotionEdit, a high-quality dataset that teaches AI to change how people and objects move in a picture without altering their appearance or the scene.
VABench is a new, all-in-one test that checks how well AI makes videos with matching sound and pictures.
Visionary is a web-based platform that lets you view and interact with advanced 3D scenes right in your browser with just a click.
Multi-agent AI teams are not automatically better; their success depends on matching the team’s coordination style to the job’s structure.
UnityVideo is a unified model that learns from many kinds of video information at once, such as color (RGB), depth, motion (optical flow), body pose, skeletons, and segmentation, to make smarter, more realistic videos.
OneStory is a new way to make long videos from many shots that stay consistent with the story, characters, and places across time.
The paper asks when reinforcement learning (RL) really makes language models better at reasoning beyond what they learned in pre-training.