This paper shows how to train big language models faster and cheaper by using 4-bit floating-point numbers (NVFP4) without losing much accuracy.
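As a rough illustration of what a 4-bit training format looks like, here is a minimal NumPy sketch of block-scaled FP4 fake-quantization. It assumes the publicly documented NVFP4 layout (E2M1 values, 16-element scaling blocks); the real format stores block scales in FP8 (E4M3), and the paper's training recipe goes well beyond this.

```python
import numpy as np

# The 8 non-negative values representable in FP4 (E2M1). NVFP4 pairs these
# with a per-16-element block scale; the real scale is stored in FP8 (E4M3),
# approximated here with float64 for simplicity.
E2M1_GRID = np.array([0.0, 0.5, 1.0, 1.5, 2.0, 3.0, 4.0, 6.0])

def quantize_nvfp4(x: np.ndarray, block: int = 16) -> np.ndarray:
    """Fake-quantize a 1-D tensor to block-scaled FP4 and back to float."""
    out = np.empty_like(x, dtype=np.float64)
    for start in range(0, len(x), block):
        chunk = x[start:start + block]
        scale = np.abs(chunk).max() / E2M1_GRID[-1]  # map max |value| to 6.0
        if scale == 0:
            out[start:start + block] = 0.0
            continue
        scaled = chunk / scale
        # Snap each scaled value to the nearest representable FP4 magnitude,
        # keeping the sign.
        idx = np.abs(np.abs(scaled)[:, None] - E2M1_GRID[None, :]).argmin(axis=1)
        out[start:start + block] = np.sign(scaled) * E2M1_GRID[idx] * scale
    return out

x = np.random.randn(64)
xq = quantize_nvfp4(x)
print("mean abs quantization error:", np.abs(x - xq).mean())
```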
VisionTrim makes picture-and-text AI models run much faster by keeping only the most useful visual pieces (tokens) and smartly merging the rest.
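A minimal sketch of the general prune-and-merge idea: the scoring rule, the merge strategy, and the `prune_and_merge` helper below are illustrative assumptions, not VisionTrim's actual method.

```python
import torch

def prune_and_merge(tokens: torch.Tensor, scores: torch.Tensor, keep: int):
    """Keep the `keep` highest-scoring visual tokens; merge the rest.

    tokens: (N, D) visual token embeddings; scores: (N,) importance scores
    (e.g., attention received from the text query -- an assumption here).
    Pruned tokens are merged into one score-weighted summary token so their
    information is not fully discarded.
    """
    order = scores.argsort(descending=True)
    kept, dropped = order[:keep], order[keep:]
    if len(dropped) == 0:
        return tokens[kept]
    w = torch.softmax(scores[dropped], dim=0)          # merge weights
    summary = (w[:, None] * tokens[dropped]).sum(dim=0, keepdim=True)
    return torch.cat([tokens[kept], summary], dim=0)   # (keep + 1, D)

tokens = torch.randn(576, 1024)        # e.g., a 24x24 ViT patch grid
scores = torch.rand(576)
reduced = prune_and_merge(tokens, scores, keep=64)
print(reduced.shape)                    # torch.Size([65, 1024])
```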
Large language models sometimes reach the right answer for the wrong reasons, which is risky and confusing.
Real attackers can try many prompts in parallel until a model slips, so testing safety with only one try badly underestimates risk.
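The math behind this is simple: if one attempt succeeds with probability p, then k independent attempts succeed with probability 1 - (1 - p)^k. A quick check with an illustrative per-attempt rate:

```python
# If a single jailbreak attempt succeeds with probability p, then k
# independent parallel attempts succeed with probability 1 - (1 - p)**k.
# Even a model that "rarely fails" breaks quickly under parallel attack.
p = 0.01                      # 1% per-attempt success (illustrative number)
for k in (1, 10, 100, 1000):
    print(k, round(1 - (1 - p) ** k, 3))
# 1 -> 0.01 | 10 -> 0.096 | 100 -> 0.634 | 1000 -> ~1.0
```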
TTCS is a way for a model to teach itself during the test: it first writes easier practice questions similar to the real hard one, then learns from solving them before attempting the real question.
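A heavily simplified sketch of that loop; `make_easier_variants`, `solve`, and `fine_tune_on` are hypothetical method names standing in for whatever TTCS actually does, not the paper's API.

```python
def test_time_curriculum(model, hard_question, n_variants=4, rounds=2):
    """Self-train at test time on easier variants, then answer for real."""
    for _ in range(rounds):
        # Generate simpler questions related to the hard one (hypothetical call).
        variants = model.make_easier_variants(hard_question, n=n_variants)
        # Solve them and treat the model's own solutions as training data.
        solved = [(q, model.solve(q)) for q in variants]
        model.fine_tune_on(solved)     # self-training step (hypothetical call)
    return model.solve(hard_question)
```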
Big models are often used to grade AI answers, but they are expensive, slow, and their grades depend too much on exactly how the judging prompt is worded.
Most reinforcement learning agents only get a simple pass/fail reward, which hides how good or bad their attempts really were.
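A toy example of why that matters; the task and scoring rules are illustrative, not from any particular paper.

```python
# Suppose an agent must produce a target sequence of steps. A binary reward
# hides near-misses that a graded reward can still learn from.
target = ["parse", "plan", "execute", "verify"]

def binary_reward(attempt):
    return 1.0 if attempt == target else 0.0

def graded_reward(attempt):
    hits = sum(a == t for a, t in zip(attempt, target))
    return hits / len(target)        # partial credit per correct step

attempt = ["parse", "plan", "execute", "check"]  # 3 of 4 steps right
print(binary_reward(attempt))   # 0.0  -> no learning signal at all
print(graded_reward(attempt))   # 0.75 -> tells the agent it was close
```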
RAPTOR is a simple, fast way to find a direction (a concept vector) inside a frozen language model's activations that points toward a concept like 'sarcasm' or 'positivity.'
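One standard way to estimate such a direction is the difference of mean activations between texts that do and don't express the concept; whether RAPTOR uses exactly this estimator is an assumption, but this sketch shows the kind of object it finds.

```python
import numpy as np

# `acts_pos` / `acts_neg` stand in for hidden states collected from a frozen
# model on texts that do / don't express the concept (random stand-ins here).
rng = np.random.default_rng(0)
acts_pos = rng.normal(0.5, 1.0, size=(200, 768))   # e.g., sarcastic texts
acts_neg = rng.normal(0.0, 1.0, size=(200, 768))   # e.g., neutral texts

# Difference-of-means direction, normalized to a unit concept vector.
direction = acts_pos.mean(axis=0) - acts_neg.mean(axis=0)
direction /= np.linalg.norm(direction)

# Projecting a new activation onto the direction scores the concept.
score = rng.normal(0.5, 1.0, size=768) @ direction
print(round(float(score), 3))
```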
This paper shows how to make a whole picture in one go, directly in pixels, without using a hidden “latent” space or many tiny steps.
Millions of public AI models exist, but downloads are concentrated on a tiny set of “official” checkpoints, which are not always the best performers.
This paper shows how to turn a big Transformer model into a faster hybrid model that mixes attention and RNN layers using far less training data (about 2.3B tokens).
The paper trains AI agents better by grading not just their final answers, but also how they reason and use tools along the way.
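A sketch of what blending process grades with the outcome grade could look like; the weights and scoring functions here are my assumptions, not the paper's.

```python
def trajectory_reward(step_scores, tool_scores, final_correct,
                      w_steps=0.3, w_tools=0.2, w_final=0.5):
    """Blend step-level reasoning grades, tool-call grades, and the final
    answer into one trajectory reward (weights are illustrative)."""
    steps = sum(step_scores) / max(len(step_scores), 1)
    tools = sum(tool_scores) / max(len(tool_scores), 1)
    return w_steps * steps + w_tools * tools + w_final * float(final_correct)

# An agent with sound intermediate work but a wrong final answer still
# receives some signal instead of a flat zero.
print(trajectory_reward([0.9, 0.8, 1.0], [1.0, 0.5], final_correct=False))
# 0.42
```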