GLM-5 is a new open-weight AI model that moves from 'vibe coding' (prompting the model to write code) to 'agentic engineering' (letting the model plan, build, test, and fix software on its own).
Step 3.5 Flash is a huge but efficient AI that keeps 196 billion total parameters but only wakes up about 11 billion per token, so it thinks smart and fast.
The paper shows that growing the embedding part of a language model (especially with n-grams) can beat adding more MoE experts once you pass a certain sparsity 'sweet spot.'
MiMo-V2-Flash is a giant but efficient language model that uses a team-of-experts design to think well while staying fast.
Nemotron 3 is a new family of open AI models (Nano, Super, Ultra) built to think better while running faster and cheaper.