Nemotron 3 is a new family of open AI models (Nano, Super, Ultra) built to think better while running faster and cheaper.
Nemotron 3 Nano is a new open-source language model that mixes two kinds of layers (Mamba and Transformer) and adds a Mixture-of-Experts (MoE) design, a team of specialist sub-networks, so it thinks better while running much faster.
INTELLECT-3 is a 106B-parameter Mixture-of-Experts model (about 12B active per token) trained with large-scale reinforcement learning, and it beats many bigger models on math, coding, science, and reasoning tests.
Large language models get smarter when they get bigger, but storing all those extra weights takes a lot of memory.
Recursive transformers save memory by reusing the same layer over and over, but that makes them less expressive and hurts accuracy.
Scone is a new AI method that makes images from instructions while picking the right subject even when several candidates look similar.
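
To make the memory saving behind recursive transformers concrete, here is a minimal sketch that counts weights for a stack of distinct layers versus one layer reused at every depth. The function names, the layer sizes, and the simplified per-layer formula (biases and normalization weights ignored) are assumptions chosen for illustration; they are not taken from any of the papers summarized above.

```python
def layer_params(d_model: int, d_ff: int) -> int:
    """Rough weight count for one transformer layer (biases and norms ignored)."""
    attention = 4 * d_model * d_model   # Q, K, V and output projections
    feed_forward = 2 * d_model * d_ff   # up-projection and down-projection
    return attention + feed_forward


def stack_params(depth: int, d_model: int, d_ff: int, shared: bool) -> int:
    """Weights stored for a stack of `depth` layers.

    shared=False: every layer has its own weights (standard transformer).
    shared=True:  one set of weights is reused at every depth (recursive style).
    """
    per_layer = layer_params(d_model, d_ff)
    return per_layer if shared else depth * per_layer


# Hypothetical sizes, picked only to make the comparison easy to read.
D_MODEL, D_FF, DEPTH = 1024, 4096, 24

standard = stack_params(DEPTH, D_MODEL, D_FF, shared=False)
recursive = stack_params(DEPTH, D_MODEL, D_FF, shared=True)

if __name__ == "__main__":
    print(f"standard stack : {standard:,} weights")
    print(f"recursive stack: {recursive:,} weights")
    print(f"stored weights shrink by a factor of {standard // recursive}")
```

Under these assumptions the stored weights shrink by a factor equal to the depth, while the compute per token stays the same because the shared layer still runs once per depth step. The accuracy cost mentioned above comes from every depth step being forced to use identical weights.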