How I Study AI - Learn AI Papers & Lectures the Easy Way

Concepts (187)

Groups

๐Ÿ“Linear Algebra15๐Ÿ“ˆCalculus & Differentiation10๐ŸŽฏOptimization14๐ŸŽฒProbability Theory12๐Ÿ“ŠStatistics for ML9๐Ÿ“กInformation Theory10๐Ÿ”บConvex Optimization7๐Ÿ”ขNumerical Methods6๐Ÿ•ธGraph Theory for Deep Learning6๐Ÿ”ตTopology for ML5๐ŸŒDifferential Geometry6โˆžMeasure Theory & Functional Analysis6๐ŸŽฐRandom Matrix Theory5๐ŸŒŠFourier Analysis & Signal Processing9๐ŸŽฐSampling & Monte Carlo Methods10๐Ÿง Deep Learning Theory12๐Ÿ›ก๏ธRegularization Theory11๐Ÿ‘๏ธAttention & Transformer Theory10๐ŸŽจGenerative Model Theory11๐Ÿ”ฎRepresentation Learning10๐ŸŽฎReinforcement Learning Mathematics9๐Ÿ”„Variational Methods8๐Ÿ“‰Loss Functions & Objectives10โฑ๏ธSequence & Temporal Models8๐Ÿ’ŽGeometric Deep Learning8

โš™๏ธAlgorithmIntermediate

Mixed Precision Training

Mixed precision training stores and computes tensors in low precision (FP16/BF16) for speed and memory savings while keeping a master copy of weights in FP32 for accurate updates.

#mixed precision · #fp16 · #bf16 · +10 more
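
A minimal NumPy sketch of the idea, not tied to any framework: an FP32 master copy of the weights, an FP16 forward/backward pass, and a fixed loss scale (128 here, an illustrative value) that keeps small gradients from flushing to zero in FP16.

```python
import numpy as np

rng = np.random.default_rng(0)
w_master = rng.normal(size=4).astype(np.float32)   # FP32 master copy of weights
x = rng.normal(size=4).astype(np.float16)          # low-precision input
y = np.float16(2.0)                                # regression target
lr, loss_scale = 1e-2, 128.0

for step in range(100):
    w16 = w_master.astype(np.float16)       # cast weights down for compute
    pred = np.dot(w16, x)                   # FP16 forward pass
    err = pred - y
    # Backward pass on the *scaled* loss so tiny gradients survive in FP16.
    grad16 = np.float16(loss_scale) * np.float16(2.0) * err * x
    grad32 = grad16.astype(np.float32) / loss_scale   # unscale in FP32
    w_master -= lr * grad32                           # accurate FP32 update

print(float((np.dot(w_master, x.astype(np.float32)) - 2.0) ** 2))  # loss -> ~0
```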
โš™๏ธAlgorithmIntermediate

Distributed & Parallel Optimization

Data parallelism splits the training data across workers that compute gradients in parallel on a shared model.

#data parallelism · #synchronous sgd · #asynchronous sgd · +12 more
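
A toy single-process simulation of synchronous data-parallel SGD (the worker count and learning rate are illustrative): each "worker" computes a gradient on its own shard, and averaging the gradients stands in for the all-reduce before one shared update.

```python
import numpy as np

rng = np.random.default_rng(0)
X, y = rng.normal(size=(32, 3)), rng.normal(size=32)
w = np.zeros(3)                            # model replicated on every worker

def local_grad(w, Xs, ys):
    """Least-squares gradient computed on one worker's shard."""
    return 2.0 * Xs.T @ (Xs @ w - ys) / len(ys)

for step in range(200):
    shards = zip(np.array_split(X, 4), np.array_split(y, 4))   # 4 "workers"
    grads = [local_grad(w, Xs, ys) for Xs, ys in shards]
    g = np.mean(grads, axis=0)             # the all-reduce: average the gradients
    w -= 0.05 * g                          # every replica applies the same update

print(np.mean((X @ w - y) ** 2))           # near the least-squares optimum
```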
โš™๏ธAlgorithmIntermediate

Lion Optimizer

Lion (Evolved Sign Momentum) is a first-order, sign-based optimizer discovered through automated program search.

#lion optimizer · #sign-based optimization · #momentum · +12 more
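
A NumPy sketch of the Lion update rule as described in "Symbolic Discovery of Optimization Algorithms"; the hyperparameters are common defaults, not tuned values.

```python
import numpy as np

def lion_step(w, g, m, lr=1e-4, beta1=0.9, beta2=0.99, wd=0.0):
    update = np.sign(beta1 * m + (1 - beta1) * g)   # only the sign is used
    w = w - lr * (update + wd * w)                  # decoupled weight decay
    m = beta2 * m + (1 - beta2) * g                 # momentum tracks the gradient
    return w, m

# Toy run on f(w) = ||w||^2; every step moves each coordinate by exactly lr.
w, m = np.array([3.0, -2.0]), np.zeros(2)
for _ in range(3000):
    w, m = lion_step(w, 2 * w, m, lr=1e-2)
print(w)                                  # oscillates within ~lr of the origin
```

Because the update magnitude is always lr regardless of gradient scale, Lion's step size doubles as its effective learning-rate floor, which is why it is typically run with a smaller lr than Adam.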
โš™๏ธAlgorithmIntermediate

Sharpness-Aware Minimization (SAM)

Sharpness-Aware Minimization (SAM) trains models to perform well even when their weights are slightly perturbed, seeking flatter minima that generalize better.

#sharpness-aware minimization · #sam optimizer · #robust optimization · +11 more
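
A NumPy sketch of one SAM step in its usual two-gradient form: an inner ascent of radius rho to an approximate worst-case point, then a descent step from the original weights using the gradient measured there. `grad_fn` and the toy loss are illustrative stand-ins.

```python
import numpy as np

def sam_step(w, grad_fn, lr=0.1, rho=0.05):
    g = grad_fn(w)
    # Inner ascent: move to the (approximate) worst point within radius rho.
    eps = rho * g / (np.linalg.norm(g) + 1e-12)
    g_sharp = grad_fn(w + eps)      # gradient measured at the perturbed weights
    return w - lr * g_sharp         # descend from the *original* weights

w = np.array([2.0, -1.0])
for _ in range(100):
    w = sam_step(w, lambda w: 2 * w)   # toy loss ||w||^2
print(w)                               # converges near the minimum
```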
โš™๏ธAlgorithmIntermediate

Sparse Matrices & Computation

A sparse matrix stores only its nonzero entries, saving huge amounts of memory when most entries are zero.

#sparse matrix · #csr · #csc · +12 more
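
A hand-rolled CSR (compressed sparse row) matrix-vector product in NumPy, just to make the three CSR arrays concrete; in practice you would reach for scipy.sparse.

```python
import numpy as np

# Dense matrix with mostly zeros.
A = np.array([[0.0, 0.0, 3.0],
              [1.0, 0.0, 0.0],
              [0.0, 2.0, 4.0]])

# CSR stores only the nonzeros: values, their column indices, and row pointers.
data    = np.array([3.0, 1.0, 2.0, 4.0])
indices = np.array([2, 0, 1, 2])     # column of each stored value
indptr  = np.array([0, 1, 2, 4])     # row i owns data[indptr[i]:indptr[i+1]]

def csr_matvec(data, indices, indptr, x):
    y = np.zeros(len(indptr) - 1)
    for i in range(len(y)):
        lo, hi = indptr[i], indptr[i + 1]
        y[i] = np.dot(data[lo:hi], x[indices[lo:hi]])   # skip all zeros
    return y

x = np.array([1.0, 1.0, 1.0])
assert np.allclose(csr_matvec(data, indices, indptr, x), A @ x)
```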
โš™๏ธAlgorithmIntermediate

Dynamic Time Warping

Dynamic Time Warping (DTW) aligns two time series that may vary in speed to find the minimum-cost correspondence between their elements.

#dynamic time warping · #dtw c++ · #time series alignment · +11 more
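
The classic O(nm) dynamic program, sketched in plain Python with absolute difference as the local cost (any distance works):

```python
import numpy as np

def dtw(a, b):
    """DTW distance between sequences a and b via dynamic programming."""
    n, m = len(a), len(b)
    D = np.full((n + 1, m + 1), np.inf)
    D[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = abs(a[i - 1] - b[j - 1])
            # Best of: match both, repeat a[i-1], or repeat b[j-1].
            D[i, j] = cost + min(D[i - 1, j - 1], D[i - 1, j], D[i, j - 1])
    return D[n, m]

print(dtw([1, 2, 3, 4], [1, 1, 2, 3, 3, 4]))  # 0.0: same shape, different speed
```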
โš™๏ธAlgorithmAdvanced

Wake-Sleep Algorithm

The Wakeโ€“Sleep algorithm trains a pair of models: a generative model that explains how data are produced and a recognition model that guesses hidden causes from observed data.

#wake-sleep · #helmholtz machine · #generative model · +12 more
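
A toy wake-sleep loop for the smallest possible sigmoid belief net, one binary latent and one binary visible unit, using the standard delta-rule updates; the dataset (x = 1 with probability 0.9) and learning rate are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)
sig = lambda t: 1.0 / (1.0 + np.exp(-t))

# Generative params: p(z=1) = sig(b), p(x=1|z) = sig(wg*z + c).
# Recognition params: q(z=1|x) = sig(wr*x + d).
b = c = d = wg = wr = 0.0
lr = 0.05

for step in range(5000):
    # Wake phase: real data; recognition guesses z; generative model learns.
    x = float(rng.random() < 0.9)               # toy dataset
    z = float(rng.random() < sig(wr * x + d))   # sample hidden cause from q
    b  += lr * (z - sig(b))                     # delta rule on p(z)
    p   = sig(wg * z + c)
    wg += lr * (x - p) * z                      # delta rule on p(x|z)
    c  += lr * (x - p)

    # Sleep phase: dream (z, x) from the generative model; recognition learns.
    z = float(rng.random() < sig(b))
    x = float(rng.random() < sig(wg * z + c))
    q   = sig(wr * x + d)
    wr += lr * (z - q) * x                      # learn to invert the dreams
    d  += lr * (z - q)

print(sig(b) * sig(wg + c) + (1 - sig(b)) * sig(c))  # model's p(x=1), near 0.9
```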
โš™๏ธAlgorithmIntermediate

Expectation Maximization (EM)

Expectation Maximization (EM) is an iterative algorithm to estimate parameters when some variables are hidden or unobserved.

#expectation maximization · #em algorithm · #e-step · +12 more
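
EM for a two-component 1-D Gaussian mixture in NumPy; the data and initialization are illustrative. The E-step computes soft assignments (responsibilities) for the hidden component labels, and the M-step re-fits the parameters against them.

```python
import numpy as np

rng = np.random.default_rng(0)
x = np.concatenate([rng.normal(-2, 1, 200), rng.normal(3, 1, 200)])

def gauss(x, mu, var):
    return np.exp(-(x - mu) ** 2 / (2 * var)) / np.sqrt(2 * np.pi * var)

pi, mu, var = np.array([0.5, 0.5]), np.array([-1.0, 1.0]), np.array([1.0, 1.0])
for it in range(50):
    # E-step: posterior responsibility of each component for each point.
    r = pi[None] * gauss(x[:, None], mu[None], var[None])
    r /= r.sum(axis=1, keepdims=True)
    # M-step: re-estimate parameters from the soft assignments.
    nk = r.sum(axis=0)
    pi = nk / len(x)
    mu = (r * x[:, None]).sum(axis=0) / nk
    var = (r * (x[:, None] - mu[None]) ** 2).sum(axis=0) / nk

print(mu)   # close to the true means -2 and 3
```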
โš™๏ธAlgorithmAdvanced

Stochastic Variational Inference

Stochastic Variational Inference (SVI) scales variational inference to large datasets by taking noisy but unbiased gradient steps using minibatches.

#stochastic variational inference · #elbo · #variational inference · +12 more
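
A minimal SVI sketch for a conjugate toy model (Gaussian likelihood with known unit variance, standard normal prior, Gaussian q), where the reparameterized ELBO gradient has a closed form. The key SVI move is the N/B factor that rescales the minibatch likelihood term so the noisy gradient stays unbiased for the full dataset; all constants here are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)
N = 10_000
data = rng.normal(2.0, 1.0, N)            # full dataset; true mean is 2

# q(mu) = N(m, exp(log_s)^2); prior p(mu) = N(0, 1); likelihood N(mu, 1).
m, log_s = 0.0, 0.0
lr, B = 1e-5, 100

for step in range(20_000):
    batch = rng.choice(data, size=B)      # noisy, unbiased view of the data
    s = np.exp(log_s)
    eps = rng.normal()
    mu = m + s * eps                      # reparameterized sample from q
    # ELBO gradient: minibatch likelihood rescaled by N/B, plus the prior term.
    dmu = (N / B) * np.sum(batch - mu) - mu
    m += lr * dmu
    log_s += lr * (dmu * s * eps + 1.0)   # +1.0 from q's entropy, d/dlog_s log s

print(m, np.exp(log_s) ** 2)   # mean near 2; variance shrinking toward ~1/(N+1)
```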
โš™๏ธAlgorithmIntermediate

PPO & Trust Region Methods

Proximal Policy Optimization (PPO) stabilizes policy gradient learning by preventing each update from moving the policy too far from the previous one.

#ppo · #trust region · #trpo · +11 more
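
The clipped surrogate at the heart of PPO, sketched in NumPy (epsilon = 0.2 is the commonly cited default; the probabilities and advantages below are made-up inputs):

```python
import numpy as np

def ppo_clip_loss(logp_new, logp_old, adv, eps=0.2):
    """PPO clipped surrogate objective (maximize; negate for gradient descent)."""
    ratio = np.exp(logp_new - logp_old)              # pi_new(a|s) / pi_old(a|s)
    unclipped = ratio * adv
    clipped = np.clip(ratio, 1 - eps, 1 + eps) * adv
    # The min removes any reward for pushing the ratio outside [1-eps, 1+eps],
    # which is what keeps each update close to the previous policy.
    return np.mean(np.minimum(unclipped, clipped))

logp_old = np.log([0.2, 0.5, 0.3])
logp_new = np.log([0.3, 0.4, 0.3])
print(ppo_clip_loss(logp_new, logp_old, np.array([1.0, -0.5, 0.2])))
```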
โš™๏ธAlgorithmIntermediate

Temporal Difference Learning

Temporal Difference (TD) Learning updates value estimates by bootstrapping from the next state's current estimate, enabling fast, online learning.

#temporal difference learning · #td(0) · #sarsa · +12 more
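
TD(0) on the classic 5-state random walk (start in the middle, +1 for exiting on the right); the step size and episode count are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)
V = np.zeros(5)                       # states 0..4; states 0 and 4 are terminal
alpha, gamma = 0.1, 1.0

for episode in range(5000):
    s = 2                             # start in the middle of the chain
    while s not in (0, 4):
        s2 = s + (1 if rng.random() < 0.5 else -1)
        r = 1.0 if s2 == 4 else 0.0   # reward only for reaching the right end
        # TD(0): move V(s) toward the bootstrapped target r + gamma * V(s').
        V[s] += alpha * (r + gamma * V[s2] - V[s])
        s = s2

print(V[1:4])                         # approaches the true values [0.25, 0.5, 0.75]
```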
โš™๏ธAlgorithmIntermediate

t-SNE & UMAP

t-SNE and UMAP are nonlinear dimensionality-reduction methods that preserve local neighborhoods to make high-dimensional data visible in 2D or 3D.

#t-sne · #umap · #dimensionality reduction · +12 more
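
A usage sketch assuming scikit-learn is installed (and, for the commented lines, the separate umap-learn package); the parameters shown are common defaults, not recommendations.

```python
from sklearn.datasets import load_digits
from sklearn.manifold import TSNE

X, y = load_digits(return_X_y=True)   # 1797 handwritten digits, 64-dim each
emb = TSNE(n_components=2, perplexity=30, random_state=0).fit_transform(X)
print(emb.shape)                      # (1797, 2): one 2-D point per digit

# UMAP exposes a near-identical API (from the umap-learn package):
# import umap
# emb = umap.UMAP(n_neighbors=15, min_dist=0.1).fit_transform(X)
```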