Deep Learning
This session teaches two essentials for building language models: PyTorch basics and resource accounting. PyTorch is a library for working with tensors (multi-dimensional arrays) that can run on CPU or GPU. You learn how to create tensors, perform math (including matrix multiplication), reshape, index and slice, and use automatic differentiation to compute gradients for training.
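
The tensor operations listed above can be sketched in a few lines. This is a minimal illustration, not material from the session itself: the variable names and the toy values are assumptions chosen so the results are easy to check by hand.

```python
import torch

# Use a GPU when one is available; otherwise fall back to the CPU.
device = "cuda" if torch.cuda.is_available() else "cpu"

# Create tensors (toy values, chosen only for illustration).
x = torch.tensor([[1.0, 2.0, 3.0],
                  [4.0, 5.0, 6.0]], device=device)        # shape (2, 3)
w = torch.ones(3, 2, device=device, requires_grad=True)   # shape (3, 2), tracked by autograd

# Math: matrix multiply (2, 3) @ (3, 2) -> (2, 2).
y = x @ w

# Reshape: same six values viewed as a flat vector.
flat = x.reshape(6)

# Index and slice: first row, and the last column of every row.
first_row = x[0]
last_col = x[:, -1]

# Automatic differentiation: reduce to a scalar, then backpropagate.
loss = y.sum()
loss.backward()

# w.grad now holds d(loss)/d(w); each entry in row i is the sum of column i of x.
print(y.shape, flat.shape, w.grad)
```

Setting `requires_grad=True` on `w` is what tells PyTorch to record the operations needed to compute its gradient; `x` is treated as fixed input data here.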
LLM
Language modeling means predicting the next token (a token is a small piece of text, such as a word or subword) given all the tokens before it. If you can estimate this next-token probability well, you can generate text by sampling one token at a time and appending it to the history. This step-by-step sampling turns probabilities into full sentences or paragraphs. Good models assign high probability to likely next tokens and low probability to unlikely ones.
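
The sampling loop described above can be sketched with a toy stand-in for a model. Everything here is hypothetical: `TOY_MODEL`, `next_token_probs`, `generate`, and the `<s>` / `</s>` markers are invented for illustration, and the table conditions only on the previous token, whereas a real language model conditions on the whole history.

```python
import random

# Hypothetical toy "model": next-token probabilities keyed by the previous token.
# A real language model would compute these from the entire history.
TOY_MODEL = {
    "<s>": {"the": 0.9, "a": 0.1},
    "the": {"cat": 0.6, "dog": 0.4},
    "a":   {"cat": 0.5, "dog": 0.5},
    "cat": {"sat": 0.7, "</s>": 0.3},
    "dog": {"sat": 0.7, "</s>": 0.3},
    "sat": {"</s>": 1.0},
}


def next_token_probs(history):
    """Return a {token: probability} dict for the token that follows `history`."""
    return TOY_MODEL[history[-1]]


def generate(max_tokens=10, seed=0):
    """Sample one token at a time, appending each to the history."""
    rng = random.Random(seed)  # seeded so runs are reproducible
    history = ["<s>"]
    for _ in range(max_tokens):
        probs = next_token_probs(history)
        tokens = list(probs)
        weights = list(probs.values())
        token = rng.choices(tokens, weights=weights, k=1)[0]
        if token == "</s>":  # end-of-sequence marker stops generation
            break
        history.append(token)
    return history[1:]  # drop the start marker


print(" ".join(generate()))
```

Changing `seed` gives different samples from the same distribution, which is why the same model can produce many different texts.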