SERA is a new, low-cost way to train coding agents that learn the style and internal details of your own codebase.
DiffProxy turns tricky multi-camera photos of a person into a clean 3D body and hands by first predicting a precise 'map' value for every pixel and then fitting a standard body model to that map.
Transparent and shiny objects confuse normal depth cameras, but video diffusion models have already learned how light bends through and reflects off them.
The paper introduces UCoder, a way to teach a code-generating AI to get better without using any outside datasets, not even unlabeled code.
The paper shows how a vision-language model (VLM) can train itself to be a fair judge of answers about images without using any human preference labels.