Trinity is a family of open language models that are huge on the inside but only wake up a few 'experts' for each word, so they are fast and affordable to run.
MiMo-V2-Flash is a giant but efficient language model that uses a team-of-experts design to think well while staying fast.
Nemotron 3 Nano is a new open-source language model that mixes two brain styles (Mamba and Transformer) and adds a team of special experts (MoE) so it thinks better while running much faster.