How I Study AI - Learn AI Papers & Lectures the Easy Way

Concepts (6)

⚙️ Algorithm · Intermediate

Mixed Precision Training

Mixed precision training stores and computes tensors in low precision (FP16/BF16) for speed and memory savings while keeping a master copy of weights in FP32 for accurate updates.

#mixed precision#fp16#bf16+10
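
To illustrate why the FP32 master copy matters, here is a minimal sketch that emulates FP16 and FP32 rounding with Python's `struct` module. The learning rate, gradient value, and step count are hypothetical, and the helper names `to_fp16` and `to_fp32` are made up for this example; a real framework would do this on tensors, not scalars.

```python
import struct

def to_fp16(x: float) -> float:
    """Round a Python float to the nearest IEEE 754 half-precision value."""
    return struct.unpack("e", struct.pack("e", x))[0]

def to_fp32(x: float) -> float:
    """Round a Python float to the nearest IEEE 754 single-precision value."""
    return struct.unpack("f", struct.pack("f", x))[0]

lr = 1e-3      # hypothetical learning rate
grad = 1e-2    # hypothetical constant gradient
steps = 1000

w_half_only = to_fp16(1.0)  # weight stored and updated only in FP16
w_master = to_fp32(1.0)     # FP32 master copy used by mixed precision
w_compute = to_fp16(w_master)

for _ in range(steps):
    # Pure FP16: the update (about 1e-5) is far smaller than the gap between
    # 1.0 and its nearest FP16 neighbor below (roughly 5e-4), so it is rounded away.
    w_half_only = to_fp16(w_half_only - to_fp16(lr * grad))

    # Mixed precision: the gradient comes from low-precision compute,
    # but the update is accumulated into the FP32 master weight.
    g16 = to_fp16(grad)
    w_master = to_fp32(w_master - lr * g16)
    # Low-precision copy that the next forward pass would use.
    w_compute = to_fp16(w_master)

print(w_half_only, w_master, w_compute)
```

In this sketch the FP16-only weight never moves, while the master weight drifts toward 0.99 as expected from 1000 updates of about 1e-5 each.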
📚 Theory · Intermediate

Multi-Head Attention

Multi-Head Attention runs several attention mechanisms in parallel so each head can focus on different relationships in the data.

#multi-head attention#scaled dot-product attention#transformer+12
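
The idea of parallel heads can be sketched in plain Python. This is a simplified illustration, not a full layer: it assumes the learned projection matrices are the identity, so each head simply sees its own slice of the feature dimension. The helper names (`matmul`, `split_heads`, `multi_head_attention`) are made up for this example.

```python
import math

def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def softmax(row):
    """Numerically stable softmax over one list of scores."""
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]

def attention(q, k, v):
    """Single-head attention: softmax(q k^T / sqrt(d_k)) v."""
    d_k = len(q[0])
    k_t = [list(col) for col in zip(*k)]
    scores = matmul(q, k_t)
    weights = [softmax([s / math.sqrt(d_k) for s in row]) for row in scores]
    return matmul(weights, v)

def split_heads(x, num_heads):
    """Slice the feature dimension into num_heads equal chunks."""
    d_head = len(x[0]) // num_heads
    return [[row[h * d_head:(h + 1) * d_head] for row in x] for h in range(num_heads)]

def multi_head_attention(q, k, v, num_heads):
    """Run attention independently per head, then concatenate the outputs.

    Assumption: learned input/output projections are omitted (identity).
    """
    heads = [
        attention(qh, kh, vh)
        for qh, kh, vh in zip(
            split_heads(q, num_heads),
            split_heads(k, num_heads),
            split_heads(v, num_heads),
        )
    ]
    return [sum((head[i] for head in heads), []) for i in range(len(q))]

# Three tokens with four features each, used as queries, keys, and values.
x = [[1.0, 0.0, 0.0, 1.0],
     [0.0, 1.0, 1.0, 0.0],
     [1.0, 1.0, 0.0, 0.0]]
out = multi_head_attention(x, x, x, num_heads=2)
```

Because each head computes its own softmax over its own feature slice, the two heads here can weight the tokens differently, which is the point of the description above.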
📚 Theory · Intermediate

Scaled Dot-Product Attention

Scaled dot-product attention decides how much each value in V should contribute to a query's output by taking dot products of the query with the keys K, dividing by \(\sqrt{d_k}\), applying softmax to get weights, and forming the weighted sum of the values.

#scaled dot-product attention#softmax#transformer+10
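
Written as a single formula, the steps in the description read as follows. The shape symbols \(n\), \(m\), and \(d_v\) are notation assumed here for illustration: \(Q \in \mathbb{R}^{n \times d_k}\), \(K \in \mathbb{R}^{m \times d_k}\), \(V \in \mathbb{R}^{m \times d_v}\), with softmax applied to each row.

\[
\mathrm{Attention}(Q, K, V) = \mathrm{softmax}\!\left(\frac{Q K^{\top}}{\sqrt{d_k}}\right) V
\]

One common motivation for the \(\sqrt{d_k}\) factor: if the components of a query and a key are independent with zero mean and unit variance, their dot product has variance \(d_k\), so dividing by \(\sqrt{d_k}\) keeps the scores at roughly unit scale before the softmax.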
∑ Math · Intermediate

Multivariable Chain Rule

The multivariable chain rule explains how rates of change pass through a pipeline of functions by multiplying the right derivatives (Jacobians) in the right order.

#multivariable chain rule#jacobian#gradient+12
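
A small worked example may make "multiplying Jacobians in the right order" concrete. The functions `f` and `g` below are hypothetical choices for illustration; the sketch computes the gradient of their composition as the product \(J_g J_f\) and compares it against a finite-difference estimate.

```python
def f(x, y):
    """Inner function f: R^2 -> R^2."""
    return (x * y, x + y)

def g(u, v):
    """Outer function g: R^2 -> R."""
    return u * u + 3.0 * v

def jacobian_f(x, y):
    """Rows are outputs of f, columns are inputs (x, y)."""
    return [[y, x],
            [1.0, 1.0]]

def jacobian_g(u, v):
    """Single row: partial derivatives of g with respect to (u, v)."""
    return [2.0 * u, 3.0]

def chain_rule_gradient(x, y):
    """Gradient of g(f(x, y)) as the row vector J_g(f(x, y)) times J_f(x, y)."""
    u, v = f(x, y)
    jg = jacobian_g(u, v)
    jf = jacobian_f(x, y)
    return [sum(jg[i] * jf[i][j] for i in range(2)) for j in range(2)]

def numeric_gradient(x, y, eps=1e-6):
    """Central finite differences on the composed function."""
    h = lambda a, b: g(*f(a, b))
    return [(h(x + eps, y) - h(x - eps, y)) / (2 * eps),
            (h(x, y + eps) - h(x, y - eps)) / (2 * eps)]

analytic = chain_rule_gradient(2.0, 3.0)   # [39.0, 27.0]
numeric = numeric_gradient(2.0, 3.0)
```

Note the order: the outer Jacobian is evaluated at \(f(x, y)\), not at \((x, y)\), and it multiplies the inner Jacobian from the left.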
∑ Math · Intermediate

Matrix Operations & Properties

Matrix operations like multiplication and transpose combine or reorient data tables and linear transformations in predictable ways.

#matrix multiplication#transpose#trace+12
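
As a quick check of what "predictable ways" means in practice, the sketch below verifies two standard identities on small hypothetical matrices: \((AB)^{\top} = B^{\top} A^{\top}\) and \(\mathrm{tr}(AB) = \mathrm{tr}(BA)\), even though \(AB \neq BA\) in general.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def transpose(a):
    """Swap rows and columns."""
    return [list(col) for col in zip(*a)]

def trace(a):
    """Sum of the diagonal entries of a square matrix."""
    return sum(a[i][i] for i in range(len(a)))

# Hypothetical 2x2 example matrices.
A = [[1, 2],
     [3, 4]]
B = [[0, 1],
     [5, -2]]

AB = matmul(A, B)   # [[10, -3], [20, -5]]
BA = matmul(B, A)   # [[3, 4], [-1, 2]]

transpose_rule_holds = transpose(AB) == matmul(transpose(B), transpose(A))
trace_rule_holds = trace(AB) == trace(BA)
commutes = AB == BA
```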
📚 Theory · Intermediate

Parallel Algorithm Theory

Parallel algorithm theory studies how to solve problems faster by coordinating many processors that share work and memory.

#pram#work-span#parallel prefix sum+12
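
The prefix-sum tag above is a standard example of the work versus span trade-off. The sketch below simulates a Hillis-Steele style inclusive scan sequentially, assuming an idealized PRAM in which every update within a round could run at the same time. The function name and the `rounds`/`work` counters are illustrative additions, not part of any library.

```python
def hillis_steele_scan(xs):
    """Inclusive prefix sum computed in rounds of independent updates.

    Returns (prefix_sums, rounds, work). `rounds` approximates the span
    (parallel time) and `work` counts the total additions performed.
    """
    out = list(xs)
    n = len(out)
    rounds = 0
    work = 0
    offset = 1
    while offset < n:
        # Snapshot: every simulated processor reads before any of them writes.
        prev = out[:]
        for i in range(offset, n):  # conceptually one parallel step
            out[i] = prev[i] + prev[i - offset]
            work += 1
        offset *= 2
        rounds += 1
    return out, rounds, work

sums, rounds, work = hillis_steele_scan([3, 1, 4, 1, 5, 9, 2, 6])
# sums == [3, 4, 8, 9, 14, 23, 25, 31], rounds == 3, work == 17
```

For 8 inputs the scan finishes in 3 rounds but performs 17 additions, versus 7 for a sequential loop. That extra work is the price this variant pays for logarithmic span; work-efficient scans reduce it at the cost of more rounds.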