ReFusion is a new way for AI to write text faster by planning in chunks (called slots) and then filling each chunk carefully.
This survey explains how AI agents remember things and organizes the whole topic into three clear parts: forms, functions, and dynamics.
Janus splits a Mixture-of-Experts (MoE) model into two parts, attention and experts, so each can use just the right number of GPUs.
Seedance 1.5 pro is a single model that makes video and sound at the same time, so lips, music, and actions match naturally.
Different programming languages scale differently when training code AI models, so treating them all the same wastes compute and lowers performance.
RecTok is a new visual tokenizer that makes the whole training path of a diffusion model (the forward flow) carry image meaning, not just the starting latent features.
This paper introduces DERL, a two-level learning system that automatically builds better reward functions for reinforcement learning agents.
FIN-bench-v2 is a big, tidy set of Finnish tests that checks how good large language models are at many things like reading, logic, and world knowledge.
KlingAvatar 2.0 is a system that makes long, sharp, lifelike talking-person videos that follow audio, images, and text instructions all at once.
ShowTable is a new way for AI to turn a data table into a beautiful, accurate infographic using a think–make–check–fix loop.
This paper builds a new test called Video Reality Test to see if AI-made ASMR videos can fool both people and vision-language models (VLMs) that watch videos.
This paper teaches robots to move their camera to a better spot before answering a question about what they see.