This paper fixes a big problem in AIs that make long videos: the video keeps snapping back to the beginning, like a movie stuck on rewind.
Coding agents waste most of their tokens just reading giant files, which makes them slow and expensive.
LongCat-Flash-Thinking-2601 is a huge 560-billion-parameter Mixture-of-Experts model built to act like a careful helper that can use tools, browse, code, and solve multi-step tasks.
Videos are made of very long lists of tokens, and regular attention compares every pair of tokens, so the cost grows with the square of the video's length, quickly becoming slow and expensive.
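To see why that hurts, here is a tiny NumPy sketch (purely illustrative, not code from the paper; the sizes are made up): full attention builds an n-by-n score table, so doubling the number of tokens quadruples the work.

```python
import numpy as np

n, d = 4096, 64            # n tokens (frames x patches), d features each; illustrative sizes
Q = np.random.randn(n, d)  # one query vector per token
K = np.random.randn(n, d)  # one key vector per token
V = np.random.randn(n, d)  # one value vector per token

scores = Q @ K.T / np.sqrt(d)                   # (n, n): a score for EVERY pair of tokens
weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
weights /= weights.sum(axis=-1, keepdims=True)  # softmax over each row
out = weights @ V                               # (n, d) attended output

print(scores.shape)  # (4096, 4096): compute and memory grow as n**2
```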
Endless Terminals is an automatic factory that builds thousands of realistic, checkable computer-terminal tasks so AI agents can practice and improve with reinforcement learning.
DSGym is a unified 'gym' where AI data science agents are tested and trained by actually running code on real datasets, not just chatting about them.
Memory-V2V teaches video editing AIs to remember what they already changed so new edits stay consistent with old ones.
Large language models usually get judged one message at a time, but many real tasks need smart planning across a whole conversation.
This paper says modern video generators are starting to act like tiny "world simulators," not just pretty video painters.
Before this work, most text-to-image models used VAEs (small, squished image codes) and struggled with slow training and overfitting on high-quality fine-tuning sets.
IVRA is a simple, training-free add-on that helps robot brains keep the 2D layout of what they see intact while following language instructions.
This paper shows that giving an AI a safe, tiny virtual computer (a sandbox) lets it solve many kinds of problems better, not just coding ones.
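To make "sandbox" concrete, the toy helper below (a hypothetical sketch, not the paper's actual system) runs model-written Python in a throwaway folder inside a separate process with a time limit, so a buggy snippet cannot hang or damage the host.

```python
import os
import subprocess
import sys
import tempfile

def run_in_sandbox(code: str, timeout_s: float = 5.0) -> str:
    """Hypothetical helper: execute untrusted Python in an isolated temp dir."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "snippet.py")
        with open(path, "w") as f:
            f.write(code)
        try:
            proc = subprocess.run(
                [sys.executable, path],      # separate interpreter process
                capture_output=True, text=True,
                timeout=timeout_s,           # kill runaway code
                cwd=tmp,                     # file writes land in the temp folder
            )
            return proc.stdout if proc.returncode == 0 else proc.stderr
        except subprocess.TimeoutExpired:
            return "TIMEOUT"

print(run_in_sandbox("print(2 + 2)"))  # -> 4
```

A real agent sandbox would add OS-level isolation (containers, resource limits), but the loop is the same: the model writes code, the sandbox runs it safely, and the result feeds back into the conversation.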