Metric Anything is a new way to teach AI real, ruler-like distances (metric depth) from very mixed and noisy 3D data.
VideoMaMa is a model that turns simple black-and-white object masks into soft, precise cutouts (alpha mattes) for every frame of a video.
Preference tuning teaches language models to act the way people like, but those habits can fall apart when the topic or style changes (domain shift).
This paper builds a foundation model called DAP that estimates real-world (metric) depth from any 360ยฐ panorama, indoors or outdoors.