Robots often get confused by wordy instructions, so this paper shows them exactly where to touch instead of describing the task in sentences.
With Goal Force, you can tell a video model what physical result you want (like “make this ball move left with a strong push”) instead of giving it only vague text or a final picture.
SHARP turns a single photo into a 3D scene you can look around in, and it does this in under one second on a single GPU.
UnityVideo is a single unified model that learns from many kinds of video information at once, such as color (RGB), depth, motion (optical flow), body pose, skeletons, and segmentation, to make smarter, more realistic videos.