Large language models get smarter when they get bigger, but storing all those extra weights eats tons of memory.
RecGPT‑V2 turns a recommender system into a smart team: a planner, several specialists, and a fair judge that all work together.
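To make the "team" idea concrete, here is a rough sketch of a planner → specialists → judge loop in code. The role names, the example catalog, and the scoring rule are illustrative assumptions, not RecGPT‑V2's actual components.

```python
# Illustrative planner / specialists / judge pipeline (hypothetical, not RecGPT-V2's API).

def planner(user_profile):
    """Break the recommendation request into sub-tasks."""
    return ["recent_interests", "long_term_taste", "trending_items"]

def specialist(task, user_profile):
    """Each specialist proposes candidates for its own sub-task (toy catalog)."""
    catalog = {
        "recent_interests": ["wireless earbuds", "running shoes"],
        "long_term_taste": ["sci-fi novels", "board games"],
        "trending_items": ["smart water bottle"],
    }
    return catalog.get(task, [])

def judge(candidates, user_profile):
    """Rank the merged candidate list; the key here is a placeholder heuristic."""
    return sorted(candidates, key=len)

def recommend(user_profile):
    tasks = planner(user_profile)
    candidates = [c for t in tasks for c in specialist(t, user_profile)]
    return judge(candidates, user_profile)

print(recommend({"user_id": 42}))
```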
This paper builds A4-Agent, a smart three-part helper that figures out where to touch or use an object just from a picture and a written instruction, without any extra training.
Large language models usually line words up in fixed position slots, which can waste attention and make it harder to find the important parts of a long or noisy text.
Vector Prism helps computers animate SVG images by first discovering which tiny shapes belong together as meaningful parts.
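As a toy illustration of the "group tiny shapes into meaningful parts" idea: the rule below (cluster shapes that share a fill color) is an invented stand-in, not Vector Prism's actual grouping method.

```python
# Toy grouping of SVG-like shapes into "parts" by shared fill color.
# This heuristic is an invented stand-in, not Vector Prism's algorithm.
from collections import defaultdict

shapes = [
    {"id": "p1", "fill": "#222", "cx": 10, "cy": 12},  # left pupil
    {"id": "p2", "fill": "#222", "cx": 40, "cy": 12},  # right pupil
    {"id": "p3", "fill": "#f5c", "cx": 25, "cy": 30},  # upper lip
    {"id": "p4", "fill": "#f5c", "cx": 25, "cy": 34},  # lower lip
]

groups = defaultdict(list)
for s in shapes:
    groups[s["fill"]].append(s["id"])  # shapes sharing a color form one candidate part

for fill, members in groups.items():
    print(f"part with fill {fill}: {members}")
```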
SS4D is a new AI model that turns a short single-camera video into a full 3D object that moves over time (that’s 4D), and it does this in about 2 minutes.
Zoom-Zero helps AI answer questions about videos by first finding the right moment and then zooming in to double-check tiny details.
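A minimal sketch of that two-stage idea, first localize the relevant moment, then zoom in and ask again. The DummyVQA class and its methods are invented placeholders, not Zoom-Zero's interface.

```python
# Two-stage video QA sketch: (1) localize the relevant moment, (2) zoom in and re-ask.
# DummyVQA and its methods are hypothetical placeholders, not Zoom-Zero's API.

class DummyVQA:
    def relevance(self, frame, question):
        return frame.get("motion", 0.0)           # pretend relevance = amount of motion
    def zoom(self, frame, question):
        return {**frame, "zoomed": True}          # pretend to crop around the key region
    def answer(self, clip, question):
        return "yes" if any(f["zoomed"] for f in clip) else "unsure"

def answer_video_question(frames, question, model, window=2):
    # Stage 1: find the single most relevant frame for the question.
    best = max(range(len(frames)), key=lambda i: model.relevance(frames[i], question))
    # Stage 2: zoom into a short clip around that moment and answer from it.
    clip = [model.zoom(f, question) for f in frames[max(0, best - window):best + window + 1]]
    return model.answer(clip, question)

frames = [{"motion": m} for m in (0.1, 0.2, 0.9, 0.3)]
print(answer_video_question(frames, "Does the cat jump?", DummyVQA()))
```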
Reinforcement learning agents often see the world in straight, flat space (Euclidean), but many decision problems look more like branching trees that fit curved, hyperbolic space better.
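The standard way to measure distance in hyperbolic space is the Poincaré-ball formula below; it shows why tree-like structures fit better there, since distances grow rapidly near the boundary, giving exponentially more "room" than flat space. This is general math background, not code from the paper.

```python
# Distance in the Poincaré ball, a standard model of hyperbolic space.
import numpy as np

def poincare_distance(u, v):
    u, v = np.asarray(u, float), np.asarray(v, float)
    num = 2.0 * np.sum((u - v) ** 2)
    den = (1.0 - np.sum(u ** 2)) * (1.0 - np.sum(v ** 2))
    return np.arccosh(1.0 + num / den)

# Two pairs with the same Euclidean gap: one near the origin, one near the boundary.
print(poincare_distance([0.0, 0.0], [0.1, 0.0]))    # small hyperbolic distance
print(poincare_distance([0.85, 0.0], [0.95, 0.0]))  # much larger, despite the same gap
```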
SonicMoE makes Mixture-of-Experts (MoE) models train faster and use less memory by redesigning how data is moved and computed on GPUs.
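For readers new to MoE, here is a toy top-2 routing step showing the computation being optimized. SonicMoE's contribution is how this work and its data movement are scheduled on GPUs, which happens below the level shown here; the sizes and routing math are just a generic example.

```python
# Toy Mixture-of-Experts forward pass with top-2 routing (pure NumPy, generic example).
import numpy as np

rng = np.random.default_rng(0)
d_model, n_experts, top_k, n_tokens = 16, 4, 2, 8

tokens = rng.standard_normal((n_tokens, d_model))
router = rng.standard_normal((d_model, n_experts))            # routing weights
experts = rng.standard_normal((n_experts, d_model, d_model))  # one weight matrix per expert

logits = tokens @ router                                      # (tokens, experts)
top = np.argsort(logits, axis=1)[:, -top_k:]                  # pick top-k experts per token
gates = np.take_along_axis(logits, top, axis=1)
gates = np.exp(gates) / np.exp(gates).sum(axis=1, keepdims=True)  # normalize gate weights

out = np.zeros_like(tokens)
for t in range(n_tokens):
    for slot in range(top_k):
        e = top[t, slot]
        out[t] += gates[t, slot] * (tokens[t] @ experts[e])   # weighted expert output

print(out.shape)  # each token gets a combined output from its chosen experts
```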
Autoregressive (AR) models write one word at a time, which is accurate but slow, especially when your computer or GPU can’t keep many tasks in memory at once.
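The "one word at a time" loop looks roughly like this: a generic greedy-decoding sketch with a dummy model standing in for the LLM, not any particular paper's code. Each step feeds the whole growing sequence back in to predict one more token, which is why generation is inherently sequential.

```python
# Generic autoregressive decoding loop; dummy_model is a stand-in for a real LLM.

def dummy_model(tokens):
    """Pretend next-token predictor: returns one token id from the current sequence."""
    return (sum(tokens) * 31 + 7) % 1000

def generate(prompt_tokens, max_new_tokens=5, eos=999):
    tokens = list(prompt_tokens)
    for _ in range(max_new_tokens):   # one full model call per generated token
        nxt = dummy_model(tokens)
        tokens.append(nxt)
        if nxt == eos:
            break
    return tokens

print(generate([12, 345, 678]))
```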
HyperVL is a small but smart model that understands images and text, designed to run fast on phones and tablets.
OpenDataArena (ODA) is a fair, open platform that measures how valuable different post‑training datasets are for large language models by holding everything else constant.
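One way to picture "holding everything else constant": an experiment config where the base model, training recipe, and evaluation suite are fixed and only the dataset field changes between runs. The field names below are made up for illustration, not ODA's actual schema.

```python
# Hypothetical controlled-comparison setup: only the post-training dataset is swapped.
base_config = {
    "base_model": "same-base-llm",
    "epochs": 2,
    "learning_rate": 2e-5,
    "eval_suites": ["reasoning", "coding", "instruction_following"],
}

datasets_to_score = ["dataset_A", "dataset_B", "dataset_C"]

runs = [{**base_config, "train_dataset": d} for d in datasets_to_score]
for run in runs:
    print(run["train_dataset"], "-> trained and evaluated with the same fixed recipe")
```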