The paper builds a new way to create realistic, long conversations between people and AI agents that use tools like databases.
Solar Open is a giant bilingual AI model (102 billion parameters) that focuses on bringing AI for underserved languages like Korean up to English-level quality.
X-Coder shows that models can learn expert-level competitive programming from data that is 100% synthetic, with no real contest problems needed.
DataFlow is a building-block system that helps large language models get better data by unifying how we create, clean, check, and organize that data.
VOYAGER is a training-free way to make large language models (LLMs) create data that is truly different, not just slightly reworded.
The paper introduces M3DR, a way for computers to find the right document image no matter which of 22 languages the query or the document uses.