SCOPE lets AI agents rewrite their own instructions while they are working, so they can fix mistakes and get smarter on the next step, not just the next task.
Simply repeating the entire prompt once (QUERY→QUERY+QUERY) helps many large language models answer more accurately when they are not asked to show their reasoning.
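The trick can be sketched in a few lines; this is a minimal illustration of the QUERY→QUERY+QUERY idea, and the helper name and separator are assumptions, not the paper's API:

```python
def duplicate_query(query: str) -> str:
    # Repeat the whole question once (QUERY -> QUERY + QUERY).
    # The reported benefit applies when the model is answering
    # directly, without being asked to show its reasoning.
    return f"{query}\n\n{query}"

prompt = duplicate_query("Which planet is closest to the Sun?")
# The model receives the same question twice in a single prompt.
```

The duplicated prompt is then sent to the model as usual; nothing else about the request changes.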
HERBench is a new test that checks whether video AI models can combine several clues spread across time, not just guess from a single frame or from language priors.
Sparse-LaViDa makes diffusion-style AI models much faster by skipping unhelpful masked tokens during generation while keeping quality the same.
Olmo 3 is a family of fully-open AI language models (7B and 32B) where every step—from raw data to training code and checkpoints—is released.
Diffusion Preview is a two-step “preview-then-refine” workflow that shows you a fast draft image first and only spends full compute after you like the draft.
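The two-step workflow above can be sketched as a simple control loop; this is a hypothetical outline under the assumption that the generator takes a step count controlling compute, and the function and parameter names are illustrative, not the paper's:

```python
def preview_then_refine(prompt, generate, approve,
                        preview_steps=8, full_steps=50):
    # Stage 1: cheap low-step draft so the user can react quickly.
    draft = generate(prompt, steps=preview_steps)
    if not approve(draft):
        return draft  # user rejected the draft: skip the expensive run
    # Stage 2: spend full compute only on an approved draft.
    return generate(prompt, steps=full_steps)

# Toy stand-ins so the sketch runs end to end.
toy_generate = lambda prompt, steps: f"image({prompt}, steps={steps})"
result = preview_then_refine("a red fox", toy_generate,
                             approve=lambda draft: True)
```

The point of the design is that rejected drafts cost only the cheap preview pass, so total compute scales with how often the user likes what they see.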
The paper shows that judging vector search only by distance-based recall and speed can be very misleading for real tasks.
Scone is a new AI method that makes images from instructions while correctly picking the right subject even when many look similar.
The FACTS Leaderboard is a four-part test that checks how truthful AI models are across images, memory, web search, and document grounding.
SHARP turns a single photo into a 3D scene you can look around in, and it does this in under one second on a single GPU.
This paper introduces the Confucius Code Agent (CCA), a coding helper built to handle huge real-world codebases with long tasks and many tools.
This paper creates MotionEdit, a high-quality dataset that teaches AI to change how people and objects move in a picture without breaking their looks or the scene.