This paper teaches a video-understanding AI to think in 3D plus time (4D) so it can answer questions about specific objects moving in videos.
Skyra is a detective-style AI that spots tiny visual mistakes (artifacts) in videos to tell if they are real or AI-generated, and it explains its decision by pointing to when and where in the video those mistakes appear.
This paper is about getting image and video generators to turn the words you type into the right pictures and videos more reliably.
MMGR is a new benchmark that checks whether AI image and video generators follow real-world rules, not just whether their outputs look pretty.
ShowTable is a new way for AI to turn a data table into a beautiful, accurate infographic using a think–make–check–fix loop.
DentalGPT is a specialized AI that looks at dental images and text together and explains what it sees the way a junior dentist would.
The paper shows that video AIs do not need long, human-like chains of thought to reason well.