This paper introduces DERL, a two-level learning system that automatically builds better reward functions for reinforcement learning agents.
FIN-bench-v2 is a big, tidy set of Finnish tests that checks how well large language models handle many skills, such as reading, logic, and world knowledge.
KlingAvatar 2.0 is a system that makes long, sharp, lifelike talking-person videos that follow audio, images, and text instructions all at once.
ShowTable is a new way for AI to turn a data table into a beautiful, accurate infographic using a think–make–check–fix loop.
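The think, make, check, fix loop can be pictured as a small control loop. The sketch below is only an illustration of that loop shape, not ShowTable's actual pipeline: the `refine` function, its four step functions, the `max_rounds` limit, and the toy table are all hypothetical stand-ins.

```python
from typing import Callable, Dict, List

Table = Dict[str, int]


def refine(
    table: Table,
    think: Callable[[Table], List[str]],
    make: Callable[[List[str]], Table],
    check: Callable[[Table, Table], List[str]],
    fix: Callable[[Table, Table, List[str]], Table],
    max_rounds: int = 3,  # hypothetical safety limit on repair rounds
) -> Table:
    """Plan, draft, then alternate checking and fixing until no problems remain."""
    plan = think(table)
    draft = make(plan)
    for _ in range(max_rounds):
        problems = check(table, draft)
        if not problems:
            break
        draft = fix(table, draft, problems)
    return draft


# Toy stand-ins: the "infographic" is just a dict that should match the table.
def think(table: Table) -> List[str]:
    return sorted(table)  # plan: which rows to draw, in order


def make(plan: List[str]) -> Table:
    return {name: 0 for name in plan}  # first draft with placeholder values


def check(table: Table, draft: Table) -> List[str]:
    return [name for name in table if draft.get(name) != table[name]]


def fix(table: Table, draft: Table, problems: List[str]) -> Table:
    repaired = dict(draft)
    for name in problems:
        repaired[name] = table[name]  # correct only the flagged rows
    return repaired


result = refine({"A": 3, "B": 5}, think, make, check, fix)
```

In a real system each step would presumably be a model call and the draft an image, but the stopping rule (quit when the check finds nothing, or when the round limit is hit) would look much the same.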
This paper builds a new test called Video Reality Test to see whether AI-made ASMR videos can fool both people and the vision-language models (VLMs) that watch them.
This paper teaches robots to move their camera to a better spot before answering a question about what they see.
Ships constantly broadcast AIS (Automatic Identification System) messages, but these messages are messy, unevenly spaced in time, and sometimes wrong.
FINCH is a new test that checks whether AI can handle real finance and accounting work using messy, real spreadsheets, emails, PDFs, charts, and more.
Robots often see the world as flat pictures but must move in a 3D world, which makes accurate actions hard.
GTR-Turbo teaches a vision-language agent using a 'free teacher' made by merging its own past checkpoints, so no costly external model is needed.
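One simple way to picture merging past checkpoints is to average their weights. The summary does not say which merge rule GTR-Turbo uses, so the sketch below assumes plain uniform averaging; the `merge_checkpoints` function and the tiny parameter dictionaries are made up for illustration.

```python
from typing import Dict, List

Params = Dict[str, List[float]]


def merge_checkpoints(checkpoints: List[Params]) -> Params:
    """Uniformly average each parameter across checkpoints (assumed merge rule)."""
    if not checkpoints:
        raise ValueError("need at least one checkpoint")
    merged: Params = {}
    for name in checkpoints[0]:
        # Line up the same parameter from every checkpoint, position by position.
        columns = zip(*(ckpt[name] for ckpt in checkpoints))
        merged[name] = [sum(values) / len(checkpoints) for values in columns]
    return merged


# Two toy "past checkpoints" of the same model.
past = [
    {"w": [0.0, 2.0], "b": [1.0]},
    {"w": [2.0, 4.0], "b": [3.0]},
]
teacher = merge_checkpoints(past)
```

The merged `teacher` has the same shape as any single checkpoint, so it can stand in for an outside teacher model without any new model being downloaded or paid for.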
Big text-to-image models make amazing pictures but are slow because they take hundreds of tiny steps to turn noise into an image.
QwenLong-L1.5 is a training recipe that helps AI read and reason over very long documents by improving the data it learns from, the way it is trained, and how it remembers important information.