How I Study AI - Learn AI Papers & Lectures the Easy Way

LLM

Stanford CS336 Language Modeling from Scratch | Spring 2025 | Lecture 10: Inference

Intermediate

Stanford Online

This session explains how to use a trained language model to produce outputs, a phase called inference. It covers three task types—conditional generation, open-ended generation, and classification—each with different input/output shapes that affect decoding choices. The lecture then dives into decoding methods, which are strategies to choose the next token step by step. Finally, it discusses how to evaluate generated text using human judgments and automatic metrics, along with their trade-offs.
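As a rough illustration of choosing the next token step by step, here is a minimal sketch of greedy decoding, the simplest such strategy: at each step it takes the single most probable token. The model below is not from the lecture. `toy_next_token_probs`, its bigram table, and the `<bos>`/`<eos>` markers are hypothetical stand-ins for a trained language model and its vocabulary.

```python
from typing import Callable, Dict, List

EOS = "<eos>"  # hypothetical end-of-sequence marker (assumption)


def toy_next_token_probs(prefix: List[str]) -> Dict[str, float]:
    """Hypothetical stand-in for a trained model.

    Returns a next-token distribution that depends only on the last token.
    """
    table = {
        "<bos>": {"the": 0.6, "a": 0.4},
        "the": {"cat": 0.5, "dog": 0.3, EOS: 0.2},
        "a": {"dog": 0.7, "cat": 0.3},
        "cat": {"sat": 0.6, EOS: 0.4},
        "dog": {"sat": 0.2, EOS: 0.8},
        "sat": {EOS: 1.0},
    }
    return table[prefix[-1]]


def greedy_decode(
    next_token_probs: Callable[[List[str]], Dict[str, float]],
    prompt: List[str],
    max_new_tokens: int = 10,
) -> List[str]:
    """Extend the prompt one token at a time, always taking the argmax."""
    tokens = list(prompt)
    for _ in range(max_new_tokens):
        probs = next_token_probs(tokens)
        best = max(probs, key=probs.get)  # most probable next token
        if best == EOS:  # stop once the model predicts end of sequence
            break
        tokens.append(best)
    return tokens


print(greedy_decode(toy_next_token_probs, ["<bos>"]))
# prints ['<bos>', 'the', 'cat', 'sat']
```

Greedy decoding is deterministic and cheap, but it commits to the locally best token at every step, so it can miss a sequence whose overall probability is higher.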

#inference #decoding #greedy decoding

🎬 AI Lectures (8)

Stanford CS336 Language Modeling from Scratch | Spring 2025 | Lecture 4: Mixture of Experts

Stanford CS336 Language Modeling from Scratch | Spring 2025 | Lecture 12: Evaluation

Stanford CS336 Language Modeling from Scratch | Spring 2025 | Lecture 11: Scaling Laws 2

Stanford CS336 Language Modeling from Scratch | Spring 2025 | Lecture 10: Inference

Stanford CS336 Language Modeling from Scratch | Spring 2025 | Lecture 9: Scaling Laws 1

Stanford CS336 Language Modeling from Scratch | Spring 2025 | Lecture 6: Kernels, Triton

Stanford CS336 Language Modeling from Scratch | Spring 2025 | Lecture 13: Data 1

Stanford CS336 Language Modeling from Scratch | Spring 2025 | Lecture 14: Data 2