GaMO is a new way to rebuild 3D scenes from just a few photos by expanding each photo’s edges (outpainting) instead of inventing whole new camera views.
The paper teaches small language models to predict open-ended future events by turning daily news into thousands of safe, graded practice questions.
Computers usually click like a woodpecker but struggle to drag smoothly like a human hand; this paper tackles that dragging problem.
This paper presents BEDA, a simple way to make chatty AI act strategically by turning what it believes into gentle rules (probabilistic constraints) that guide what it can say.
The paper fixes a stability problem in Hyper-Connections (HC) by gently steering the network’s mixing matrix onto a safe shape (a manifold) where signals don’t blow up or vanish.
This paper builds an open, end-to-end ecosystem (ALE) that lets AI agents plan, act, and fix their own mistakes across many steps in real computer environments.
Dream2Flow lets a robot watch a short, AI-generated video of a task and then do that task in real life by following object motion in 3D.
FlowBlending is a simple way to speed up video diffusion models by smartly choosing when to use a big model and when a small one is enough.
The paper introduces Nested Learning, a new way to build AI that learns in layers (like Russian dolls), so each part can update at its own speed and remember different things.
Youtu-LLM is a small (1.96B-parameter) language model that was trained from scratch to think, plan, and act like an agent instead of just copying bigger models.
Language is lumpy: easy stretches and tricky jumps are mixed together, but old models spend the same effort on every word.
Youtu-Agent is a build-and-grow factory for AI agents that cuts manual setup and keeps agents improving over time.