Real attackers can try many prompts in parallel until a model slips, so testing safety with only one try badly underestimates risk.
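To see why, a quick back-of-the-envelope calculation helps (illustrative, not from the paper): if a single attempt succeeds with probability p, the chance that at least one of k independent attempts succeeds is 1 - (1 - p)^k.

```python
# Illustrative sketch: how risk compounds under repeated attempts.
def attack_success_at_k(p: float, k: int) -> float:
    """Chance that at least one of k independent tries succeeds,
    given per-try success probability p."""
    return 1.0 - (1.0 - p) ** k

print(attack_success_at_k(0.01, 1))    # 0.01  -> looks safe with one try
print(attack_success_at_k(0.01, 100))  # ~0.63 -> very unsafe at scale
```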
TTCS is a way for a model to teach itself at test time: it first writes easier practice questions similar to the real hard question, learns from solving them, and then attempts the original.
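The high-level loop might look like the sketch below; the three hooks are placeholders I'm assuming for illustration, not the paper's actual interfaces.

```python
# Hedged sketch of a test-time curriculum loop (hooks are hypothetical).
def test_time_curriculum(make_practice, learn, answer, hard_question):
    """make_practice(q) -> iterable of (question, solution) pairs;
    learn(q, s) updates the model on one pair; answer(q) -> final attempt."""
    for question, solution in make_practice(hard_question):
        learn(question, solution)   # study the easier variants first
    return answer(hard_question)    # then attempt the real question
```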
Big models are often used to grade AI answers, but they are expensive, slow, and overly sensitive to how the grading prompt is written.
Most reinforcement learning agents get only a simple pass/fail reward, which hides how close each attempt actually came to succeeding.
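For intuition, compare a pass/fail reward with one that gives partial credit (a generic illustration, not this paper's exact scheme):

```python
# Illustrative only: binary vs. dense reward on a task scored by unit tests.
def binary_reward(passed: int, total: int) -> float:
    return 1.0 if passed == total else 0.0

def dense_reward(passed: int, total: int) -> float:
    return passed / total  # partial credit for near-misses

# binary_reward(9, 10) == 0.0 but dense_reward(9, 10) == 0.9:
# an attempt that almost worked is no longer graded like a total failure.
```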
RAPTOR is a simple, fast way to find a direction (a concept vector) inside a frozen language model that points toward a concept like 'sarcasm' or 'positivity.'
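One common baseline for finding such a direction, shown here as a hedged stand-in since RAPTOR's exact recipe may differ, is the difference of mean activations between examples that show the concept and examples that don't:

```python
import numpy as np

def concept_vector(pos_acts: np.ndarray, neg_acts: np.ndarray) -> np.ndarray:
    """pos_acts / neg_acts: (n_examples, hidden_dim) activations collected
    from the frozen model on concept vs. non-concept texts."""
    direction = pos_acts.mean(axis=0) - neg_acts.mean(axis=0)
    return direction / np.linalg.norm(direction)

# Projecting a new hidden state h onto the vector gives a concept score:
# score = h @ v, where larger means "more sarcastic", say.
```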
This paper shows how to make a whole picture in one go, directly in pixels, without using a hidden “latent” space or many tiny steps.
Millions of public AI models exist, but downloads are concentrated on a tiny set of “official” checkpoints, which are not always the best performers.
This paper shows how to turn a big Transformer model into a faster hybrid that mixes attention and RNN layers, using far less training data than training from scratch (about 2.3B tokens).
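A toy version of such a hybrid stack is sketched below, with a GRU standing in for whatever recurrent layer the paper actually uses (that substitution is my assumption):

```python
import torch.nn as nn

class HybridBlock(nn.Module):
    """Keeps attention in some layers, swaps others for a recurrent mixer.
    The GRU is an illustrative stand-in, not the paper's RNN layer."""
    def __init__(self, dim: int, n_heads: int, use_attention: bool):
        super().__init__()
        self.norm = nn.LayerNorm(dim)
        self.use_attention = use_attention
        if use_attention:
            self.mixer = nn.MultiheadAttention(dim, n_heads, batch_first=True)
        else:
            self.mixer = nn.GRU(dim, dim, batch_first=True)

    def forward(self, x):
        h = self.norm(x)
        if self.use_attention:
            h, _ = self.mixer(h, h, h, need_weights=False)
        else:
            h, _ = self.mixer(h)  # recurrent layers avoid quadratic attention cost
        return x + h  # residual connection
```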
This paper trains AI agents better by grading not just their final answers but also how they reason and use tools along the way.
DynamicVLA is a small and fast robot brain that sees, reads, and acts while things are moving.
Large language models usually learn by guessing the next word, then get a tiny bit of instruction-following practice; this paper flips that by turning massive web documents into instruction-and-answer pairs at huge scale.
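The core transformation can be pictured as a prompt-driven pipeline; the prompt text and `generate` hook below are my illustrative assumptions, not the paper's actual prompts.

```python
# Hedged sketch: turn raw documents into (instruction, answer) pairs.
PROMPT = ("Read the document below. Write one question a user might ask "
          "and an answer grounded in the document.\n\nDocument:\n{doc}")

def documents_to_pairs(generate, documents):
    """generate: caller-supplied function mapping a prompt string to
    model output text (any chat model API would do)."""
    for doc in documents:
        yield generate(PROMPT.format(doc=doc))
```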
This paper shows a simple, one-model way to dub videos so that the new voice and the speaker's lip movements stay naturally in sync.