UltraShape 1.0 is a two-step 3D generator that first makes a simple overall shape and then zooms in to add tiny details.
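A minimal sketch of what a coarse-then-refine pipeline like that looks like in code; the two-stage structure follows the summary above, but the function names and the toy voxel math are stand-ins for the paper's actual generator and detail model.

```python
import numpy as np

def generate_coarse_shape(resolution=32, seed=0):
    """Stage 1 (hypothetical stand-in): a low-resolution occupancy grid for the overall shape."""
    rng = np.random.default_rng(seed)
    coords = np.linspace(-1, 1, resolution)
    x, y, z = np.meshgrid(coords, coords, coords, indexing="ij")
    radius = 0.7 + 0.05 * rng.standard_normal()          # a blurry blob in place of a learned generator
    return (x**2 + y**2 + z**2 < radius**2).astype(np.float32)

def refine_details(coarse, factor=4):
    """Stage 2 (hypothetical stand-in): upsample the coarse shape and carve in fine surface detail."""
    fine = np.repeat(np.repeat(np.repeat(coarse, factor, 0), factor, 1), factor, 2)
    noise = np.random.default_rng(1).standard_normal(fine.shape) * 0.1
    return np.clip(fine + noise * (fine > 0), 0.0, 1.0)   # perturb only the occupied region

coarse = generate_coarse_shape()           # simple overall shape first
detailed = refine_details(coarse)          # then zoom in and add tiny details
print(coarse.shape, "->", detailed.shape)  # (32, 32, 32) -> (128, 128, 128)
```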
This paper makes 3D scenes much faster at handling big, 512‑dimensional features without throwing away important information.
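One common way to keep high-dimensional features cheap without losing much is a low-rank projection; the PCA sketch below illustrates that general idea on 512-dimensional vectors and is an assumption about the technique family, not the paper's specific method.

```python
import numpy as np

# Hypothetical 3D-scene features: 50k points, 512 dims each, with hidden low-rank structure.
rng = np.random.default_rng(0)
latent = rng.standard_normal((50_000, 64)).astype(np.float32)
mixing = rng.standard_normal((64, 512)).astype(np.float32)
features = latent @ mixing

# Low-rank compression via PCA: keep the top-k directions that carry most of the signal.
k = 64
mean = features.mean(axis=0)
centered = features - mean
_, _, vt = np.linalg.svd(centered[:5_000], full_matrices=False)   # fit axes on a subsample
basis = vt[:k]                           # (k, 512) projection matrix

compressed = centered @ basis.T          # 512 -> 64 dims: 8x less memory per point
reconstructed = compressed @ basis + mean
err = np.linalg.norm(features - reconstructed) / np.linalg.norm(features)
print(f"relative reconstruction error: {err:.4f}")   # near zero: little information lost
```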
Nemotron 3 is a new family of open AI models (Nano, Super, Ultra) built to think better while running faster and cheaper.
Nemotron 3 Nano is a new open-source language model that mixes two brain styles (Mamba and Transformer) and adds a team of special experts (MoE) so it thinks better while running much faster.
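A toy sketch of how a hybrid stack can interleave Mamba-style sequence blocks, attention blocks, and mixture-of-experts feed-forward layers. The layer pattern, sizes, and a simple gated recurrence standing in for a real Mamba state-space block are all illustrative assumptions, not Nemotron's actual configuration.

```python
import torch
import torch.nn as nn

class GatedRecurrentBlock(nn.Module):
    """Stand-in for a Mamba-style block: a simple gated linear recurrence, no attention matrix."""
    def __init__(self, d):
        super().__init__()
        self.inp, self.gate, self.out = nn.Linear(d, d), nn.Linear(d, d), nn.Linear(d, d)

    def forward(self, x):                       # x: (batch, seq, d)
        g, v = torch.sigmoid(self.gate(x)), self.inp(x)
        h, states = torch.zeros_like(x[:, 0]), []
        for t in range(x.size(1)):              # sequential scan, cost grows linearly with length
            h = g[:, t] * h + (1 - g[:, t]) * v[:, t]
            states.append(h)
        return x + self.out(torch.stack(states, dim=1))

class MoEFeedForward(nn.Module):
    """Mixture-of-experts FFN: a router sends each token to one expert."""
    def __init__(self, d, n_experts=4):
        super().__init__()
        self.router = nn.Linear(d, n_experts)
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(d, 4 * d), nn.GELU(), nn.Linear(4 * d, d))
            for _ in range(n_experts))

    def forward(self, x):
        choice = self.router(x).argmax(dim=-1)          # top-1 routing per token
        out = torch.zeros_like(x)
        for i, expert in enumerate(self.experts):
            mask = choice == i
            if mask.any():
                out[mask] = expert(x[mask])             # only the chosen expert runs per token
        return x + out

class HybridBlock(nn.Module):
    """One hybrid layer: sequence mixing (recurrence or attention) followed by an MoE FFN."""
    def __init__(self, d, use_attention):
        super().__init__()
        self.use_attention = use_attention
        self.mix = (nn.MultiheadAttention(d, num_heads=4, batch_first=True)
                    if use_attention else GatedRecurrentBlock(d))
        self.ffn = MoEFeedForward(d)

    def forward(self, x):
        if self.use_attention:
            x = x + self.mix(x, x, x, need_weights=False)[0]
        else:
            x = self.mix(x)
        return self.ffn(x)

# Illustrative pattern: mostly recurrent blocks with an occasional attention block.
d_model = 64
layers = nn.Sequential(*[HybridBlock(d_model, use_attention=(i % 4 == 3)) for i in range(8)])
tokens = torch.randn(2, 16, d_model)
print(layers(tokens).shape)   # torch.Size([2, 16, 64])
```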
SemanticGen is a new way to make videos that starts by planning in a small, high-level 'idea space' (semantic space) and then adds the tiny visual details later.
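A schematic sketch of the "plan in a small semantic space first, decode pixel detail second" idea; the module choices, sizes, and the stand-in prompt embedding are placeholders, not the paper's actual planner or decoder.

```python
import torch
import torch.nn as nn

# Hypothetical sizes: a short clip as 16 frames, each planned as a 32-dim semantic token.
frames, semantic_dim, pixel_dim = 16, 32, 3 * 64 * 64

# Stage 1: generate the whole clip in the compact semantic space (cheap, global planning).
semantic_planner = nn.GRU(input_size=semantic_dim, hidden_size=semantic_dim, batch_first=True)
prompt_embedding = torch.randn(1, 1, semantic_dim)             # stand-in for a text prompt
plan, _ = semantic_planner(prompt_embedding.repeat(1, frames, 1))

# Stage 2: decode each planned semantic token into detailed pixels (expensive, local detail).
detail_decoder = nn.Sequential(nn.Linear(semantic_dim, 512), nn.GELU(), nn.Linear(512, pixel_dim))
video = detail_decoder(plan).reshape(1, frames, 3, 64, 64)
print(video.shape)   # torch.Size([1, 16, 3, 64, 64])
```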
LongVideoAgent is a team of three AIs that work together to answer questions about hour‑long TV episodes without missing small details.
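A toy sketch of a three-agent split for long-video question answering; the role names (planner, retriever, answerer), the chunking, and the keyword matching are assumptions used to illustrate the division of labor, not the paper's exact agents.

```python
from dataclasses import dataclass

@dataclass
class Clip:
    start_min: int
    end_min: int
    transcript: str

def planner(question: str, episode_minutes: int) -> list[tuple[int, int]]:
    """Agent 1 (hypothetical): split the hour-long episode into windows worth inspecting."""
    step = 10
    return [(t, min(t + step, episode_minutes)) for t in range(0, episode_minutes, step)]

def retriever(window: tuple[int, int], clips: list[Clip], question: str) -> list[Clip]:
    """Agent 2 (hypothetical): pull only clips in this window that mention the question's terms."""
    terms = {w.lower() for w in question.split() if len(w) > 3}
    return [c for c in clips if window[0] <= c.start_min < window[1]
            and terms & set(c.transcript.lower().split())]

def answerer(question: str, evidence: list[Clip]) -> str:
    """Agent 3 (hypothetical): read the gathered evidence and produce the final answer."""
    if not evidence:
        return "No supporting clip found."
    best = evidence[0]
    return f"Answer based on minutes {best.start_min}-{best.end_min}: {best.transcript}"

clips = [Clip(42, 43, "the detective finds the hidden letter"),
         Clip(55, 56, "dinner scene with the family")]
question = "When does the detective find the letter?"
evidence = [c for w in planner(question, 60) for c in retriever(w, clips, question)]
print(answerer(question, evidence))
```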
SpatialTree is a new, four-level "ability tree" that tests how multimodal AI models (that see and read) handle space: from basic seeing to acting in the world.
The paper turns video avatars from passive puppets into active doers that can plan, act, check their own work, and fix mistakes over many steps.
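A minimal sketch of the plan-act-check-fix loop the summary describes; every step function here is a placeholder for the avatar's actual planning, generation, and self-checking modules.

```python
import random

def plan(goal: str) -> list[str]:
    """Hypothetical planner: break the goal into ordered sub-steps."""
    return [f"{goal} - step {i}" for i in range(1, 4)]

def act(step: str) -> dict:
    """Hypothetical actor: execute/render one step and return the result."""
    return {"step": step, "quality": random.random()}

def verify(result: dict) -> bool:
    """Hypothetical critic: self-check whether the step's output is good enough."""
    return result["quality"] > 0.4

def repair(step: str) -> dict:
    """Hypothetical fixer: retry a failed step with an adjusted attempt."""
    return {"step": step + " (retry)", "quality": 0.9}

def run_avatar(goal: str) -> list[dict]:
    trajectory = []
    for step in plan(goal):        # plan over many steps
        result = act(step)         # act
        if not verify(result):     # check its own work
            result = repair(step)  # fix mistakes
        trajectory.append(result)
    return trajectory

random.seed(0)
for r in run_avatar("pour a cup of tea"):
    print(r)
```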
The paper shows that big sequence models (like transformers) quietly learn longer-term goals inside their hidden activations, even though they are trained one step at a time.
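Findings like this are usually demonstrated with linear probing: train a small classifier on hidden activations to read out something that only matters several steps in the future. The sketch below shows that recipe on synthetic data; the fake activations and the hand-rolled logistic probe are illustrations, not the paper's models or tasks.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-in for hidden activations: 2,000 examples of 128-dim states, where one
# direction secretly encodes a "goal" that only becomes observable several steps later.
n, d = 2000, 128
goal = rng.integers(0, 2, size=n)                  # the future outcome (k steps ahead)
hidden = rng.standard_normal((n, d))
hidden[:, 7] += 2.0 * (goal - 0.5)                 # the model "quietly" stores the goal here

# Linear probe: logistic regression trained on hidden states to read the future goal out.
w, b, lr = np.zeros(d), 0.0, 0.1
train, test = slice(0, 1500), slice(1500, n)
for _ in range(300):
    p = 1 / (1 + np.exp(-(hidden[train] @ w + b)))
    w -= lr * hidden[train].T @ (p - goal[train]) / 1500
    b -= lr * np.mean(p - goal[train])

pred = (hidden[test] @ w + b) > 0
print(f"probe accuracy on held-out states: {np.mean(pred == goal[test]):.2f}")  # well above 0.50
```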
Large language models often sound confident even when they are wrong, and existing ways to catch mistakes are slow or not very accurate.
The paper tackles a big blind spot in vision-language models: understanding how objects move and relate in 3D over time (dynamic spatial reasoning, or DSR).
Search is not the same as research; real research needs planning, checking many sources, fixing mistakes, and writing a clear report.
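A skeletal sketch of the research loop that the summary contrasts with plain search: plan sub-questions, gather from several sources, cross-check, drop unsupported claims, and write up. The function names and the fake source results are placeholders, not any particular system's API.

```python
def plan_questions(topic: str) -> list[str]:
    """Hypothetical planner: turn one topic into several concrete sub-questions."""
    return [f"What is known about {topic}?",
            f"Where do sources disagree on {topic}?",
            f"What evidence is still missing on {topic}?"]

def search_sources(question: str) -> list[dict]:
    """Stand-in for a search tool: return a few fake sources, each asserting a claim."""
    return [{"source": f"source-{i}", "claim": f"claim {i % 2} relevant to: {question}"}
            for i in range(3)]

def cross_check(findings: list[dict]) -> list[str]:
    """Fixing-mistakes step (simplified): keep only claims that more than one source supports."""
    support = {}
    for f in findings:
        support.setdefault(f["claim"], set()).add(f["source"])
    return [claim for claim, sources in support.items() if len(sources) > 1]

def write_report(topic: str, verified: list[str]) -> str:
    """Assemble only the cross-checked claims into a readable report."""
    lines = [f"Report on {topic}", "-" * (10 + len(topic))]
    lines += [f"* {c}" for c in verified] or ["* no verified findings"]
    return "\n".join(lines)

topic = "long-context language models"
findings = [f for q in plan_questions(topic) for f in search_sources(q)]
print(write_report(topic, cross_check(findings)))
```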