The paper fixes a stability problem in Hyper-Connections (HC) by gently steering the network’s mixing matrix onto a safe shape (a manifold) where signals don’t blow up or vanish.
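The "safe shape" here is, loosely, a set of mixing matrices that neither amplify nor shrink the residual signal. As a purely illustrative sketch (not the paper's actual construction), one cheap way to keep a mixing matrix from blowing signals up or collapsing them is to rescale it by its largest singular value:

```python
import numpy as np

def project_to_stable(M, eps=1e-8):
    """Rescale mixing matrix M so its largest singular value is 1.
    Repeated application then cannot amplify the dominant signal
    direction (toy illustration, not the paper's method)."""
    sigma_max = np.linalg.norm(M, ord=2)  # largest singular value
    return M / (sigma_max + eps)

rng = np.random.default_rng(0)
M = rng.normal(size=(4, 4))
M_stable = project_to_stable(M)
print(round(float(np.linalg.norm(M_stable, ord=2)), 6))  # → 1.0
```

The real paper steers the matrix onto its chosen manifold during training rather than hard-projecting like this, but the goal is the same: keep the mixing operator in a regime where depth-wise signal propagation stays bounded.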
This paper builds an open, end-to-end ecosystem (ALE) that lets AI agents plan, act, and fix their own mistakes across many steps in real computer environments.
Dream2Flow lets a robot watch a short, AI-generated video of a task and then do that task in real life by following object motion in 3D.
FlowBlending is a simple way to speed up video diffusion models by smartly choosing when to use a big model and when a small one is enough.
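To make the big/small trade-off concrete, here is a toy per-step scheduler (an assumption for illustration, not FlowBlending's actual rule): spend the big model on the early, structure-defining denoising steps and hand the later refinement steps to the small model.

```python
def plan_schedule(num_steps, big_fraction=0.4):
    """Toy big/small model schedule for a diffusion sampler.
    Early steps (which set global structure) go to the big model;
    later refinement steps go to the small one. Illustrative only."""
    cutoff = int(num_steps * big_fraction)
    return ["big" if t < cutoff else "small" for t in range(num_steps)]

schedule = plan_schedule(10)
print(schedule.count("big"), schedule.count("small"))  # → 4 6
```

Any real system would pick the switch point (and possibly switch back and forth) based on measured quality, but the sketch shows the basic shape of the decision.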
The paper introduces Nested Learning, a new way to build AI that learns in layers (like Russian dolls), so each part can update at its own speed and remember different things.
Youtu-LLM is a small (1.96B-parameter) language model that was trained from scratch to think, plan, and act like an agent instead of just distilling bigger models.
Language is lumpy: easy stretches and tricky jumps are mixed together, yet standard models spend the same compute on every word.
Youtu-Agent is a build-and-grow factory for AI agents that cuts manual setup and keeps agents improving over time.
This paper teaches text-to-video models to follow real-world physics, so people, balls, water, glass, and fire act the way they should.
SenseNova-MARS is a vision-language model that can think step-by-step and use three tools (text search, image search, and image cropping) during its reasoning.
FIGR is a new way for AI to ‘think by drawing,’ using code to build clean, editable diagrams while it reasons.
Multimodal Large Language Models (MLLMs) often hallucinate on videos by trusting words and common sense more than what the frames really show.