How I Study AI - Learn AI Papers & Lectures the Easy Way

Stanford CS336 Language Modeling from Scratch | Spring 2025 | Lecture 3: Architectures, Hyperparameters

Beginner

Stanford Online

Language modeling means predicting the next token (a token is a small piece of text, such as a word or subword) given all the tokens before it. If you can estimate this next-token probability well, you can generate text by sampling one token at a time and appending it to the history. This step-by-step sampling turns probabilities into full sentences or paragraphs. A good model assigns high probability to likely next tokens and low probability to unlikely ones.
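The sampling loop described above can be sketched in a few lines of Python. This is only a toy illustration: the probability table `NEXT_TOKEN_PROBS`, the `<s>` and `</s>` start and end markers, and the function names are all assumptions made for this example, and the toy "model" looks only at the last token, whereas a real language model conditions on the whole history.

```python
import random

# Hypothetical toy table: maps the most recent token to a probability
# distribution over the next token. A real model would compute this
# distribution from the entire history, not just the last token.
NEXT_TOKEN_PROBS = {
    "<s>": {"the": 0.9, "a": 0.1},
    "the": {"cat": 0.6, "dog": 0.4},
    "a": {"cat": 0.5, "dog": 0.5},
    "cat": {"sat": 0.7, "</s>": 0.3},
    "dog": {"sat": 0.7, "</s>": 0.3},
    "sat": {"</s>": 1.0},
}


def next_token_distribution(history):
    """Return P(next token | history) for the toy model."""
    return NEXT_TOKEN_PROBS[history[-1]]


def generate(max_tokens=10, seed=0):
    """Sample one token at a time and append it to the history."""
    rng = random.Random(seed)  # seeded so repeated runs give the same text
    history = ["<s>"]
    for _ in range(max_tokens):
        dist = next_token_distribution(history)
        tokens = list(dist)
        weights = [dist[t] for t in tokens]
        token = rng.choices(tokens, weights=weights, k=1)[0]
        if token == "</s>":  # end-of-sequence marker: stop generating
            break
        history.append(token)
    return history[1:]  # drop the start marker


if __name__ == "__main__":
    print(" ".join(generate()))
```

Changing `seed` gives different samples from the same distribution. Replacing the table lookup with a trained model's output would leave the loop itself unchanged.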

#language modeling #next-token prediction #embedding

🎬 AI Lectures (40)

Stanford CS336 Language Modeling from Scratch | Spring 2025 | Lecture 2: Pytorch, Resource Accounting

Stanford CS336 Language Modeling from Scratch | Spring 2025 | Lecture 3: Architectures, Hyperparameters

Stanford CS336 Language Modeling from Scratch | Spring 2025 | Lecture 1: Overview and Tokenization

Stanford CS336 Language Modeling from Scratch | Spring 2025 | Lecture 5: GPUs

Chapter 1: Vectors, what even are they? | Essence of Linear Algebra

Chapter 12: A geometric interpretation of Cramer's rule | Essence of Linear Algebra

Chapter 6: The determinant | Essence of Linear Algebra

Chapter 5: Three-dimensional linear transformations | Essence of Linear Algebra

Chapter 9: Dot products and duality | Essence of Linear Algebra

Chapter 8: Nonsquare matrices as transformations between dimensions | Essence of Linear Algebra

Chapter 7: Inverse matrices, column space, and null space | Essence of Linear Algebra

Why visual understanding of linear algebra matters first