The paper discovers a tiny, special group of neurons inside large language models (LLMs) that act like a reward system in the human brain.
The paper shows that when vision-language models write captions, only a small set of uncertain words (about 20%) act like forks that steer the whole sentence.