MegaFlow is a new system that helps thousands of AI agents practice and test big, messy tasks (like fixing real software bugs) all at once without crashing or wasting money.
OpenTinker is an open-source system that makes training AI agents with reinforcement learning simple, modular, and reusable.
Diffusion Language Models (DLMs) write by polishing whole sentences in several passes instead of one token at a time.
The paper introduces Controlled Self-Evolution (CSE), a smarter way for AI to write and improve code quickly under a tight budget of tries.
VideoLoom is a single AI model that can tell both when something happens in a video and where it happens, at the pixel level.
Image-to-Video models often keep the picture looking right but ignore parts of the text instructions.
MeepleLM is a specialized AI that reads a board game’s rulebook and pretends to be different kinds of players to give helpful, honest feedback.
The paper shows that when we give AI lots of extra text, even harmless extra text, it can get badly confused, losing up to 80% of its accuracy.
Dr. Zero is a pair of AI agents (a Proposer and a Solver) that teach each other to do web-search-based reasoning without any human-written training data.
Solar Open is a giant bilingual AI (102 billion parameters) that focuses on helping underserved languages like Korean catch up with English-level AI quality.
X-Coder shows that models can learn expert-level competitive programming using data that is 100% synthetic, with no real contest problems needed.
VideoDR is a new benchmark that tests whether AI can watch a video, pull out key visual clues, search the open web, and chain the clues together to find one verifiable answer.