DentalGPT is a specialized AI model that reads dental images and text together and explains its findings the way a junior dentist would.
The paper asks how best to use expert step-by-step solutions (expert trajectories) when teaching large AI models to reason after pretraining.
This paper asks whether reinforcement learning (RL) can improve the generation of 3D models from text, and shows that it can when the training and rewards are designed carefully.
Role-playing agents need to juggle several goals at once, like staying in character, following instructions, and using the right tone.
The paper shows that video AI models do not need long, human-like chains of thought to reason well.
Diffusion language models write by gradually unmasking hidden words, so deciding which blanks to reveal next is a big deal for both speed and accuracy.
This paper teaches a vision-language model to think about images by talking to copies of itself, using only words to plan and decide.
TreeGRPO teaches image generators using a smart branching tree so each training run produces many useful learning signals instead of just one.
The paper asks when reinforcement learning (RL) actually makes language models better at reasoning, beyond what they learned in pretraining.
The paper shows that making a model write a number as a sequence of digits and then grading the whole number at the end works better than grading each digit separately.
Large language models forget or misuse new facts if you only poke their weights once; EtCon fixes this with a two-step plan.
COOPER is a single AI model that both “sees better” (perceives depth and object boundaries) and “thinks smarter” (reasons step by step) to answer spatial questions about images.
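
The summary above about writing a number as a sequence of digits contrasts two ways of grading: digit by digit, or the whole number at the end. The toy sketch below illustrates that contrast only. It is not the paper's training objective, and the function names `per_digit_score` and `whole_number_score`, as well as the `1 / (1 + error)` scoring rule, are assumptions made for illustration.

```python
def per_digit_score(pred: str, target: str) -> float:
    """Fraction of digit positions that match, after left-padding with zeros.

    Hypothetical stand-in for grading each digit separately.
    """
    width = max(len(pred), len(target))
    p, t = pred.zfill(width), target.zfill(width)
    return sum(a == b for a, b in zip(p, t)) / width


def whole_number_score(pred: str, target: str) -> float:
    """Grade the decoded number as a whole: 1 / (1 + absolute error).

    Hypothetical stand-in for grading the whole number at the end.
    """
    return 1.0 / (1.0 + abs(int(pred) - int(target)))


if __name__ == "__main__":
    target = "1000"
    for pred in ["0999", "9000"]:
        print(pred, per_digit_score(pred, target), whole_number_score(pred, target))
```

With a target of 1000, the prediction 0999 is off by one yet matches no digit, while 9000 is off by 8000 yet matches three of four digits. Grading per digit therefore ranks the worse guess higher, and grading the whole number ranks the two guesses by how close they actually are.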