This paper teaches an AI to solve diagram-based math problems by copying how people think: first seeing (perception), then making sense of what was seen (internalization), and finally reasoning to solve the problem.
DataFlow is a building-block system that helps large language models get better data by unifying how that data is created, cleaned, checked, and organized.
This paper teaches a vision-language model to first find objects in real 3D space (not just 2D pictures) and then reason about where things are.
Skyra is a detective-style AI that spots tiny visual mistakes (artifacts) in videos to tell whether they are real or AI-generated, and it explains its decisions by pointing to when and where in the video the artifacts appear.
Nemotron-Math is a giant math dataset with 7.5 million step-by-step solutions created in three thinking styles and with or without Python help.
OpenDataArena (ODA) is a fair, open platform that measures how valuable different post‑training datasets are for large language models by holding everything else constant.
Reasoning tokens (the words a model writes before its final answer) help the model think better, but they are not a trustworthy diary of how it really thought.
DentalGPT is a specialized AI that looks at dental images and text together and explains what it sees the way a junior dentist would.
The paper shows that video AIs do not need long, human-like chains of thought to reason well.
VG-Refiner is a new way for AI to find the right object in a picture when given a description, even if helper tools make mistakes.
Large language models forget or misuse new facts if you only tweak their weights once; EtCon fixes this with a two-step plan.
ThreadWeaver teaches a language model to split big problems into smaller parts it can solve at the same time, like teammates working in parallel.
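The divide-and-merge idea behind that summary can be sketched generically. This is a toy illustration, not ThreadWeaver's actual method: the function names and the toy task (summing a large range) are assumptions made up for the example.

```python
from concurrent.futures import ThreadPoolExecutor

# Toy illustration (not ThreadWeaver's method): split one big problem --
# summing a large range -- into independent subproblems, solve them
# concurrently, then merge the partial results.
def solve_subproblem(bounds):
    lo, hi = bounds
    return sum(range(lo, hi))

def solve_in_parallel(n, workers=4):
    # Carve [0, n) into `workers` contiguous chunks.
    step = n // workers
    chunks = [(i * step, n if i == workers - 1 else (i + 1) * step)
              for i in range(workers)]
    # Solve each chunk in parallel, like teammates working at once.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partials = pool.map(solve_subproblem, chunks)
    # Merge: combine the partial answers into the final one.
    return sum(partials)

print(solve_in_parallel(1_000_000))  # matches sum(range(1_000_000))
```

The merge step matters: the subproblems here are independent, so combining their answers is a simple sum; harder problems need a smarter way to split and recombine.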