This paper says long chain-of-thought (Long CoT) works best when it follows a 'molecular' pattern with three kinds of thinking bonds: Deep-Reasoning, Self-Reflection, and Self-Exploration.
VideoAR is a new AI system that makes videos by writing each frame like a story, one step at a time, while painting in details from coarse to fine.
Machine learning agents usually improve by writing code, running it for hours, and then using the results to tweak the next try, so each improvement cycle is very slow.
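That propose-run-observe loop can be sketched in a few lines. This is a toy illustration only: the `run_experiment` objective and the divide-by-three tweak rule are made up to stand in for hours of real training, not any paper's actual setup.

```python
def run_experiment(lr):
    # Stand-in for an hours-long training run: the score peaks near a
    # hidden "best" learning rate of 0.003. (Hypothetical toy objective.)
    return -(lr - 0.003) ** 2

def agent_improve(n_rounds=5):
    """Generic agent loop: propose a config, run it, keep the best.

    In a real agent, run_experiment is the expensive step that makes
    the whole cycle slow; everything else is cheap bookkeeping.
    """
    best_lr, best_score = None, float("-inf")
    lr = 0.1  # initial guess
    for _ in range(n_rounds):
        score = run_experiment(lr)   # the slow, expensive step
        if score > best_score:
            best_lr, best_score = lr, score
        lr = lr / 3                  # naive tweak for the next try
    return best_lr
```

Because each round must wait for a full run before the next tweak, the wall-clock cost grows linearly with the number of tries.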
LLMs can look confident but still change their answers when the surrounding text nudges them, showing that confidence alone isn’t real truthfulness.
Preference tuning teaches language models to act the way people like, but those habits can fall apart when the topic or style changes (domain shift).
Video models can now be told what physical result you want (like “make this ball move left with a strong push”) using Goal Force, instead of just vague text or a final picture.
This paper shows that the best VAEs for image generation are the ones whose latents neatly separate object attributes, a property called semantic disentanglement.
EnvScaler is an automatic factory that builds many safe, rule-following practice worlds where AI agents can talk to users and call tools, just like real apps.
PaCoRe is a way for AI to think in many parallel paths and then coordinate them, so it can use a lot more brainpower at test time without running out of context window space.
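A generic version of the parallel-then-coordinate idea looks like the sketch below. The `reason_path` function and the majority-vote coordination are illustrative stand-ins, not PaCoRe's actual mechanism, which the one-line summary does not spell out.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor

def reason_path(question, seed):
    # Stand-in for one independent reasoning sample; a real system would
    # call an LLM here. Some seeds go wrong, mimicking sampling noise.
    return 42 if seed % 5 != 0 else 41

def parallel_reason(question, n_paths=8):
    """Run many short reasoning paths at once, then coordinate them.

    Each path stays short, so no single context window has to hold all
    of the thinking; the coordination step (here a simple majority vote)
    merges the paths' conclusions.
    """
    with ThreadPoolExecutor(max_workers=n_paths) as pool:
        answers = list(pool.map(lambda s: reason_path(question, s),
                                range(n_paths)))
    return Counter(answers).most_common(1)[0][0]
```

The key property is that total thinking scales with the number of paths, while per-path context stays bounded.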
This paper teaches an AI model to understand both which way an object is facing (orientation) and how it turns between views (rotation), all in one system.
FinVault is a new test that checks if AI helpers for finance stay safe while actually doing real jobs, not just chatting.
The paper shows that language models with a search tool often look up too much information, which wastes compute and can make answers worse on unanswerable questions.
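One simple fix the finding suggests is gating retrieval on the model's own confidence. The interface and threshold below are illustrative assumptions, not the paper's actual method.

```python
def gated_answer(question, direct_answer, confidence, search_fn,
                 threshold=0.8):
    """Call the search tool only when the model's confidence is low.

    Skipping retrieval when the model already knows the answer saves
    compute; for unanswerable questions, an empty search result lets us
    abstain instead of stitching together irrelevant snippets.
    (Hypothetical sketch: threshold and search_fn are assumptions.)
    """
    if confidence >= threshold:
        return direct_answer          # no search needed
    results = search_fn(question)     # fall back to retrieval
    if not results:
        return "I don't know"         # abstain rather than guess
    return results[0]
```

In this sketch, over-search is avoided by the first branch, and the unanswerable case is handled by abstaining when retrieval comes back empty.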