Timer-S1 is a huge time-series model (8.3B total parameters, only 0.75B active per step) that forecasts the future by reasoning step-by-step inside a single forward pass.
Speculative decoding speeds up big language models by letting a small helper model guess several next words and having the big model check them all at once.
Trinity is a family of open language models that are huge on the inside but only wake up a few “experts” for each word, so they are fast and affordable to run.
MOVA is an open-source AI that makes videos and sounds at the same time so mouths, actions, and noises match perfectly.
SWE-Universe is a factory-like system that turns real GitHub pull requests into safe, repeatable coding practice worlds with automatic checkers.
OmegaUse is a new AI that can use phones and computers by looking at screenshots and deciding where to click, type, or scroll—much like a careful human user.
Mixture-of-Experts (MoE) models often send far more tokens to a few “favorite” experts, which overloads some GPUs while others sit idle.
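This imbalance is easy to quantify. A common countermeasure is an auxiliary load-balancing loss of the Switch-Transformer style, N · Σᵢ fᵢ·pᵢ; the sketch below uses made-up routing fractions to show how a "favorite" expert raises the loss.

```python
def load_balance_loss(token_fracs, router_probs):
    # Switch-Transformer-style auxiliary loss: N * sum_i f_i * p_i, where
    # f_i is the fraction of tokens routed to expert i and p_i is the mean
    # router probability for expert i. It bottoms out at 1.0 when load is
    # spread evenly across all N experts.
    n = len(token_fracs)
    return n * sum(f * p for f, p in zip(token_fracs, router_probs))

balanced  = [0.25, 0.25, 0.25, 0.25]   # every expert gets 25% of tokens
favorites = [0.70, 0.15, 0.10, 0.05]   # one overloaded "favorite" expert

print(load_balance_loss(balanced, balanced))            # 1.0
print(round(load_balance_loss(favorites, favorites), 6))  # 2.1
```

Adding this term to the training loss nudges the router toward the uniform split, which keeps GPUs evenly busy.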
TAG-MoE is a new way to steer Mixture-of-Experts (MoE) models using clear task hints, so the right “mini-experts” handle the right parts of an image job.
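One simple way to steer routing with a task hint is to add a task-dependent bias to the router's scores. The sketch below is an assumed mechanism for illustration only (the actual TAG-MoE design may differ), with made-up task names and bias values.

```python
# Hypothetical task-conditioned top-1 router: the score for each expert is
# its content score plus a bias looked up from an explicit task tag.

TASK_BIAS = {
    "denoise":  [2.0, 0.0, 0.0, 0.0],   # made up: expert 0 suits denoising
    "superres": [0.0, 2.0, 0.0, 0.0],   # made up: expert 1 suits upscaling
}

def route(content_scores, task):
    # Pick the expert with the highest biased score.
    biased = [s + b for s, b in zip(content_scores, TASK_BIAS[task])]
    return max(range(len(biased)), key=biased.__getitem__)

scores = [0.5, 0.9, 0.1, 0.2]
print(route(scores, "denoise"))   # 0 -- task hint overrides content score
print(route(scores, "superres"))  # 1
```

The same token-level scores land on different experts depending on the tag, which is the point: the hint, not just the content, decides who handles the work.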
Solar Open is a giant bilingual AI (102 billion parameters) that focuses on helping underserved languages like Korean catch up with English-level AI quality.
Big all-in-one language models are powerful but too expensive to run everywhere, while small specialists are cheaper but narrow.
Mixture-of-Experts (MoE) models use many small specialist networks and activate only a few per token, but classic LoRA fine-tuning assigns every expert the same adapter rank, wasting parameters on experts that need less capacity.
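The parameter arithmetic behind that waste is simple: a LoRA adapter of rank r on a d_in × d_out weight adds r·(d_in + d_out) parameters. The sketch below uses made-up dimensions and rank choices to show that the same total budget can be spread unevenly, giving busy experts more rank.

```python
def lora_params(d_in, d_out, rank):
    # A LoRA adapter factors the weight update as B @ A, with
    # A: (rank, d_in) and B: (d_out, rank), so it costs rank*(d_in+d_out).
    return rank * (d_in + d_out)

D = 1024  # hypothetical expert width (square weight matrices for simplicity)

uniform_ranks  = [8] * 8                       # classic LoRA: same rank everywhere
adaptive_ranks = [16, 16, 12, 8, 4, 4, 2, 2]   # more rank for the experts that need it

budget_uniform  = sum(lora_params(D, D, r) for r in uniform_ranks)
budget_adaptive = sum(lora_params(D, D, r) for r in adaptive_ranks)

# Both allocations sum to rank 64, hence the exact same parameter budget:
print(budget_uniform, budget_adaptive)  # 131072 131072
```

Under an equal budget, the adaptive split concentrates capacity where it helps; the uniform split gives rarely-used experts adapters they barely exercise.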
Mixture-of-Experts (MoE) language models don’t split cleanly into domain specialists; instead, a small, stable group of experts gets chosen again and again across many subjects.
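That observation is checkable by tallying which experts each domain's tokens get routed to and intersecting the top picks. The routing counts below are invented purely to illustrate the pattern of a stable core group.

```python
from collections import Counter

# Hypothetical per-domain routing tallies (expert id -> tokens routed there),
# made up to illustrate the finding: the same small group of experts
# dominates regardless of subject.
tallies = {
    "code": Counter({0: 500, 3: 420, 7: 390, 1: 60, 5: 30}),
    "math": Counter({3: 510, 0: 480, 7: 350, 2: 80, 6: 40}),
    "news": Counter({7: 470, 0: 460, 3: 400, 4: 70, 1: 50}),
}

def top_experts(tally, k=3):
    # The k most frequently chosen experts for one domain.
    return {expert for expert, _ in tally.most_common(k)}

core = set.intersection(*(top_experts(t) for t in tallies.values()))
print(sorted(core))  # [0, 3, 7] -- the same trio tops every domain
```

If experts were clean domain specialists, these top-k sets would barely overlap; a large, stable intersection across subjects is the signature the paper describes.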