Stanford CS336 Language Modeling from Scratch | Spring 2025 | Lecture 3: Architectures, Hyperparameters
Beginner: Language modeling means predicting the next token (a token is a small piece of text like a word or subword) given all the tokens before it. If you can estimate this next-token probability well, you can generate text by sampling one token at a time and appending it to the history. This step-by-step sampling turns probabilities into full sentences or paragraphs. Good models assign high probability to likely tokens and low probability to unlikely ones.
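The sample-append loop above can be sketched with a toy model. This is a minimal illustration, not the lecture's architecture: the hand-written bigram table and the token names (`<s>`, `</s>`, etc.) are assumptions standing in for a learned neural model's next-token distribution.

```python
import random

# Toy "language model": a hand-made bigram table mapping each token to a
# probability distribution over possible next tokens. A real model would
# learn these conditional probabilities from data instead.
BIGRAM = {
    "<s>": {"the": 0.6, "a": 0.4},    # <s> marks the start of a sequence
    "the": {"cat": 0.5, "dog": 0.5},
    "a":   {"cat": 0.5, "dog": 0.5},
    "cat": {"sat": 0.7, "</s>": 0.3}, # </s> marks the end of a sequence
    "dog": {"sat": 0.7, "</s>": 0.3},
    "sat": {"</s>": 1.0},
}

def sample_next(token, rng):
    """Sample one next token from the model's conditional distribution."""
    dist = BIGRAM[token]
    r = rng.random()
    cum = 0.0
    for tok, p in dist.items():
        cum += p
        if r < cum:
            return tok
    return tok  # fallback for floating-point rounding at the boundary

def generate(rng, max_len=10):
    """Generate text token by token: sample, append, repeat until </s>."""
    tokens = ["<s>"]
    while tokens[-1] != "</s>" and len(tokens) < max_len:
        tokens.append(sample_next(tokens[-1], rng))
    # Strip the boundary markers before returning the generated text.
    return tokens[1:-1] if tokens[-1] == "</s>" else tokens[1:]

print(" ".join(generate(random.Random(0))))
```

Note that this toy model conditions only on the previous token; the models covered in the lecture condition on the entire history, but the sampling loop itself is the same.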
#language modeling #next-token prediction #embedding