Over++ is a video AI that adds realistic effects like shadows, splashes, dust, and smoke between a foreground and a background without changing the original footage.
StoryMem is a new way to make minute‑long, multi‑shot videos that keep the same characters, places, and style across many clips.
CASA is a new way to mix images and text inside a language model that stays fast and uses little memory while keeping accuracy high.
QuantiPhy is a new test that checks if AI models can measure real-world physics from videos using numbers, not guesses.
QuCo-RAG is a new way to decide when an AI should look things up while it writes, using facts from its training data instead of its own shaky confidence.
DramaBench is a new test that scores how well AI continues drama scripts on six separate skills instead of giving one big score.
This paper asks a simple question with a big impact: can AI tell which test questions are hard for humans?
This paper asks if large language models (LLMs) can act like "world models" that predict what happens next in text-based environments, not just the next word in a sentence.
This paper builds a tough new test called O3-BENCH to check if AI can truly think with images, not just spot objects.
Capitalization tie-out checks if a company’s ownership table truly matches what its legal documents say.
SWE-EVO is a new test (benchmark) that checks if AI coding agents can upgrade real software projects over many steps, not just fix one small bug.
MatSpray turns 2D guesses about what materials look like (color, shininess, metalness) into a clean 3D model you can relight realistically.