How I Study AI - Learn AI Papers & Lectures the Easy Way

🎬 AI Lectures (43)

Stanford CS329H: ML from Human Preferences | Autumn 2024 | Model-based Preference Optimization
Basics · Beginner · Stanford

Decision trees are flowchart-like models used to predict a class (like yes/no) by asking a series of questions about features. You start at the root and follow branches based on answers until you reach a leaf with a class label. Each internal node tests one attribute, each branch is an outcome of that test, and each leaf gives the prediction.

#decision-tree #entropy #information-gain
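The entropy and information gain named in the tags are what drive a decision tree's choice of questions. A minimal sketch in plain Python (toy yes/no data, not course code):

```python
from collections import Counter
from math import log2

def entropy(labels):
    """Shannon entropy of a list of class labels, in bits."""
    n = len(labels)
    return -sum((c / n) * log2(c / n) for c in Counter(labels).values())

def information_gain(labels, groups):
    """Entropy reduction from splitting `labels` into `groups`."""
    n = len(labels)
    weighted = sum(len(g) / n * entropy(g) for g in groups)
    return entropy(labels) - weighted

# Toy split on one feature: a perfect split removes all uncertainty,
# a useless split removes none.
labels = ["yes", "yes", "no", "no"]
perfect = [["yes", "yes"], ["no", "no"]]
useless = [["yes", "no"], ["yes", "no"]]
print(information_gain(labels, perfect))   # 1.0
print(information_gain(labels, useless))   # 0.0
```

A tree-building algorithm such as ID3 scores every candidate split this way and keeps the one with the highest gain.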
Stanford CS329H: Machine Learning from Human Preferences | Autumn 2024 | Mechanism Design
ML · Beginner · Stanford

This lesson explains the core pieces of machine learning: data (X and Y), models f(x;θ), loss functions that measure mistakes, and optimizers that adjust θ to reduce the loss. It divides learning into supervised (with labels), unsupervised (without labels), and reinforcement learning (with rewards). The focus here is on supervised learning, especially regression and classification, plus a short intro to k-means clustering.

#supervised-learning #unsupervised-learning #regression
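The data/model/loss/optimizer loop described above fits in a few lines of plain Python. A sketch with hypothetical 1-D data (not course code): the model is f(x; θ) = θ·x, the loss is mean squared error, and the optimizer is plain gradient descent.

```python
# Hypothetical 1-D linear regression: model f(x; theta) = theta * x.
X = [1.0, 2.0, 3.0, 4.0]
Y = [2.0, 4.0, 6.0, 8.0]          # generated by the "true" theta = 2

theta = 0.0                        # initial parameter
lr = 0.01                          # learning rate

for _ in range(500):
    # Gradient of the mean squared error L(theta) = mean((theta*x - y)^2)
    grad = sum(2 * (theta * x - y) * x for x, y in zip(X, Y)) / len(X)
    theta -= lr * grad             # optimizer step: move against the gradient

print(round(theta, 3))             # converges near 2.0
```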
Stanford CS336 Language Modeling from Scratch | Spring 2025 | Lecture 2: PyTorch, Resource Accounting
Deep Learning · Beginner · Stanford Online

This session teaches two essentials for building language models: PyTorch basics and resource accounting. PyTorch is a library for working with tensors (multi‑dimensional arrays) and can run on CPU or GPU. You learn how to create tensors, perform math (including matrix multiplies), reshape, index/slice, and use automatic differentiation to compute gradients for training.

#pytorch #tensor #autograd
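Those PyTorch basics can be shown in a few lines (a generic illustration assuming `torch` is installed, not the lecture's code):

```python
import torch

# Tensors and math: create, matrix-multiply, reshape, index.
a = torch.tensor([[1.0, 2.0], [3.0, 4.0]])
b = torch.tensor([[1.0, 0.0], [0.0, 1.0]])   # identity matrix
c = a @ b                                     # matrix multiply
flat = c.reshape(-1)                          # reshape 2x2 -> 1-D of 4
first_row = c[0]                              # indexing/slicing

# Autograd: y = sum(x^2) has gradient dy/dx = 2x.
x = torch.tensor([1.0, 2.0, 3.0], requires_grad=True)
y = (x ** 2).sum()
y.backward()                                  # fills x.grad
print(x.grad)                                 # tensor([2., 4., 6.])
```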
Stanford CS336 Language Modeling from Scratch | Spring 2025 | Lecture 3: Architectures, Hyperparameters
LLM · Beginner · Stanford Online

Language modeling means predicting the next token (a token is a small piece of text like a word or subword) given all tokens before it. If you can estimate this next-token probability well, you can generate text by sampling one token at a time and appending it to the history. This step-by-step sampling turns probabilities into full sentences or paragraphs. Good models assign high probability to likely next tokens and low probability to unlikely ones.

#language-modeling #next-token-prediction #embedding
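The step-by-step sampling described above can be sketched with a made-up bigram table standing in for a trained model (purely illustrative):

```python
import random

# Hypothetical next-token distributions: each token maps to the
# probabilities of what comes next.
probs = {
    "the": {"cat": 0.6, "dog": 0.4},
    "cat": {"sat": 0.9, "ran": 0.1},
    "dog": {"sat": 0.5, "ran": 0.5},
    "sat": {"<eos>": 1.0},
    "ran": {"<eos>": 1.0},
}

def generate(start, rng):
    """Sample one token at a time, appending each to the history."""
    tokens = [start]
    while tokens[-1] != "<eos>":
        dist = probs[tokens[-1]]
        nxt = rng.choices(list(dist), weights=list(dist.values()))[0]
        tokens.append(nxt)
    return tokens

print(generate("the", random.Random(0)))   # e.g. ['the', 'cat', 'sat', '<eos>']
```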
Stanford CS336 Language Modeling from Scratch | Spring 2025 | Lecture 1: Overview and Tokenization
NLP · Beginner · Stanford Online

This session introduces a brand-new course on building language models from scratch. You learn what language modeling is, where it’s used (speech recognition, translation, text generation, classification), and how different modeling families work. The class emphasizes implementing models yourself in Python and PyTorch, plus how to train and evaluate them.

#language-modeling #tokenization #n-gram
Stanford CS336 Language Modeling from Scratch | Spring 2025 | Lecture 4: Mixture of Experts
LLM · Intermediate · Stanford Online

The lecture explains why simply making language models bigger (more parameters) helped for years, but also why data size and training time matter just as much. From BERT in 2018 to GPT‑2, GPT‑3, PaLM, Chinchilla, and Llama 2, the trend shows performance rises when models are scaled correctly with enough data and compute.

#mixture-of-experts #sparse-activation #conditional-computation
Stanford CS336 Language Modeling from Scratch | Spring 2025 | Lecture 12: Evaluation
LLM · Intermediate · Stanford Online

Evaluation tells us how good a language model really is. There are two big ways to judge models: intrinsic (measure the model directly) and extrinsic (measure it through real tasks). Intrinsic is fast and clean but might not reflect real-world usefulness. Extrinsic is realistic and practical but slow and complicated to run.

#language-model-evaluation #perplexity #intrinsic-evaluation
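The classic intrinsic metric named in the tags, perplexity, is the exponential of the average negative log-probability per token. A generic sketch (standard formula, not the lecture's code):

```python
from math import exp, log

def perplexity(token_probs):
    """token_probs: the model's probability for each observed token."""
    nll = -sum(log(p) for p in token_probs) / len(token_probs)
    return exp(nll)

# A model that always gives the right token probability 1/4 has
# perplexity 4: it is as "surprised" as a uniform 4-way guess.
print(perplexity([0.25, 0.25, 0.25, 0.25]))   # ≈ 4.0
print(perplexity([1.0, 1.0, 1.0]))            # 1.0, never surprised
```

Lower perplexity means the model assigns higher probability to the text it actually sees, which is why it is fast and clean to measure but says nothing directly about downstream usefulness.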
Stanford CS336 Language Modeling from Scratch | Spring 2025 | Lecture 11: Scaling Laws 2
LLM · Intermediate · Stanford Online

Scaling laws relate a model’s log loss (how surprised it is by the next token) to three knobs: number of parameters (N), dataset size (D), and compute budget (C). As you increase N, D, and C, loss usually drops smoothly. But this only holds when you keep many other things steady and consistent.

#scaling-laws #log-loss #perplexity
Stanford CS336 Language Modeling from Scratch | Spring 2025 | Lecture 5: GPUs
Deep Learning · Beginner · Stanford Online

GPUs (Graphics Processing Units) are critical for deep learning because they run thousands of simple math operations at the same time. Language models like Transformers rely on huge numbers of matrix multiplications, which are perfect for parallel processing. CPUs have a few strong cores for complex, step-by-step tasks, while GPUs have many simpler cores for doing lots of math in parallel. Using GPUs correctly can make training and inference dramatically faster.

#gpu #cuda #pytorch
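The usual PyTorch idiom for that CPU/GPU choice (a standard pattern, assumes `torch` is installed; falls back to CPU when no GPU is present):

```python
import torch

# Pick the GPU when available, otherwise the CPU; the same code
# then runs on either device.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

a = torch.randn(256, 128, device=device)
b = torch.randn(128, 64, device=device)
c = a @ b          # the kind of matmul GPUs parallelize across many cores

print(c.shape, c.device)
```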
Stanford CS336 Language Modeling from Scratch | Spring 2025 | Lecture 7: Parallelism 1
Deep Learning · Intermediate · Stanford Online

This lesson teaches two big ways to train neural networks on many GPUs: data parallelism and model parallelism. Data parallelism copies the whole model to every GPU and splits the dataset into equal shards, then averages gradients to take one update step. Model parallelism splits the model itself across GPUs and passes activations forward and gradients backward between them.

#data-parallelism #model-parallelism #parameter-server
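The gradient-averaging step of data parallelism can be sketched in a single process, with list shards standing in for GPUs (illustrative only, hypothetical 1-D model):

```python
# Each "GPU" computes the gradient on its own shard; the update uses
# the average, which (for equal-size shards) equals the gradient over
# the full batch.
def grad(theta, shard):
    # gradient of mean squared error for f(x) = theta * x on one shard
    return sum(2 * (theta * x - y) * x for x, y in shard) / len(shard)

data = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0), (4.0, 8.0)]
shards = [data[:2], data[2:]]              # split the batch across 2 "GPUs"

theta = 0.0
per_gpu = [grad(theta, s) for s in shards]  # computed in parallel in reality
avg = sum(per_gpu) / len(per_gpu)           # the all-reduce (average) step
theta -= 0.01 * avg                         # one synchronized update

print(avg, grad(0.0, data))                 # equal shards: these two match
```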
Stanford CS336 Language Modeling from Scratch | Spring 2025 | Lecture 10: Inference
LLM · Intermediate · Stanford Online

This session explains how to use a trained language model to produce outputs, a phase called inference. It covers three task types—conditional generation, open-ended generation, and classification—each with different input/output shapes that affect decoding choices. The lecture then dives into decoding methods, which are strategies to choose the next token step by step. Finally, it discusses how to evaluate generated text using human judgments and automatic metrics, along with their trade-offs.

#inference #decoding #greedy-decoding
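The simplest decoding method from the tags, greedy decoding, picks the single highest-probability token at every step. A sketch with a hypothetical toy model standing in for a trained LM:

```python
def next_token_probs(history):
    # Hypothetical stand-in for a trained model's output distribution.
    table = {
        (): {"the": 0.7, "a": 0.3},
        ("the",): {"cat": 0.5, "dog": 0.4, "<eos>": 0.1},
        ("the", "cat"): {"sat": 0.8, "<eos>": 0.2},
        ("the", "cat", "sat"): {"<eos>": 1.0},
    }
    return table[tuple(history)]

def greedy_decode(max_len=10):
    history = []
    for _ in range(max_len):
        dist = next_token_probs(history)
        tok = max(dist, key=dist.get)   # greedy: argmax, no sampling
        if tok == "<eos>":
            break
        history.append(tok)
    return history

print(greedy_decode())   # ['the', 'cat', 'sat']
```

Greedy decoding is deterministic and cheap, which is why other methods (sampling, beam search) exist for open-ended generation, where always taking the argmax tends to produce repetitive text.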
Stanford CS336 Language Modeling from Scratch | Spring 2025 | Lecture 9: Scaling Laws 1
LLM · Intermediate · Stanford Online

Scaling laws are empirical rules that show how a model’s loss (error) drops as you grow model size, data, or compute. They take a power-law form: Loss = A × N^(-α), where N can be parameters, data tokens, or compute, and α is the scaling exponent. This lets us predict how bigger models might perform without training them.

#scaling-laws #power-law #language-models
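The power-law form can be tried out directly. In the sketch below, A and α are made-up illustrative constants, not fitted values from the lecture:

```python
# Loss = A * N**(-alpha): a power law in N (parameters, tokens, or compute).
def predicted_loss(n_params, A=10.0, alpha=0.1):
    return A * n_params ** (-alpha)

# Doubling N shrinks the loss by the same constant factor 2**(-alpha),
# no matter how big the model already is -- the signature of a power law.
small = predicted_loss(1e8)
big = predicted_loss(2e8)
print(big / small)   # ≈ 0.933, i.e. 2 ** -0.1
```

This constant-ratio property is what lets researchers fit the curve on small models and extrapolate to budgets they have not trained yet.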