๐ŸŽ“How I Study AIHISA
๐Ÿ“–Read
๐Ÿ“„Papers๐Ÿ“ฐBlogs๐ŸŽฌCourses
๐Ÿ’กLearn
๐Ÿ›ค๏ธPaths๐Ÿ“šTopics๐Ÿ’กConcepts๐ŸŽดShorts
๐ŸŽฏPractice
๐Ÿ“Daily Log๐ŸŽฏPrompts๐Ÿง Review
SearchSettings
How I Study AI - Learn AI Papers & Lectures the Easy Way

Concepts8

Groups

๐Ÿ“Linear Algebra15๐Ÿ“ˆCalculus & Differentiation10๐ŸŽฏOptimization14๐ŸŽฒProbability Theory12๐Ÿ“ŠStatistics for ML9๐Ÿ“กInformation Theory10๐Ÿ”บConvex Optimization7๐Ÿ”ขNumerical Methods6๐Ÿ•ธGraph Theory for Deep Learning6๐Ÿ”ตTopology for ML5๐ŸŒDifferential Geometry6โˆžMeasure Theory & Functional Analysis6๐ŸŽฐRandom Matrix Theory5๐ŸŒŠFourier Analysis & Signal Processing9๐ŸŽฐSampling & Monte Carlo Methods10๐Ÿง Deep Learning Theory12๐Ÿ›ก๏ธRegularization Theory11๐Ÿ‘๏ธAttention & Transformer Theory10๐ŸŽจGenerative Model Theory11๐Ÿ”ฎRepresentation Learning10๐ŸŽฎReinforcement Learning Mathematics9๐Ÿ”„Variational Methods8๐Ÿ“‰Loss Functions & Objectives10โฑ๏ธSequence & Temporal Models8๐Ÿ’ŽGeometric Deep Learning8

Category

๐Ÿ”ทAllโˆ‘Mathโš™๏ธAlgo๐Ÿ—‚๏ธDS๐Ÿ“šTheory

Level

AllBeginnerIntermediate
โˆ‘MathIntermediate

Law of Large Numbers

The Weak Law of Large Numbers (WLLN) says that the sample average of independent, identically distributed (i.i.d.) random variables with finite mean gets close to the true mean with high probability as the sample size grows.

#law of large numbers#weak law#sample mean+12
โš™๏ธAlgorithmIntermediate

Distributed & Parallel Optimization

Data parallelism splits the training data across workers that compute gradients in parallel on a shared model.

#data parallelism
Advanced
Filtering by:
#variance reduction
#synchronous sgd
#asynchronous sgd
+12
โš™๏ธAlgorithmIntermediate

Stratified & Latin Hypercube Sampling

Stratified sampling reduces Monte Carlo variance by dividing the domain into non-overlapping regions (strata) and sampling within each region.

#stratified sampling#latin hypercube sampling#variance reduction+11
โš™๏ธAlgorithmIntermediate

Importance Sampling

Importance sampling rewrites an expectation under a hard-to-sample distribution p as an expectation under an easier distribution q, multiplied by a weight w = p/q.

#importance sampling#proposal distribution#self-normalized+12
โš™๏ธAlgorithmIntermediate

Monte Carlo Estimation

Monte Carlo estimation approximates an expected value by averaging function values at random samples drawn from a probability distribution.

#monte carlo#expectation#variance reduction+12
โš™๏ธAlgorithmIntermediate

Numerical Integration & Monte Carlo

Numerical integration approximates the area under a curve when an exact antiderivative is unknown, using deterministic quadrature rules or random sampling (Monte Carlo).

#numerical integration#quadrature#trapezoidal rule+11
โš™๏ธAlgorithmIntermediate

Stochastic Gradient Descent (SGD)

Stochastic Gradient Descent (SGD) updates model parameters using small random subsets (mini-batches) of data, making learning faster and more memory-efficient.

#stochastic gradient descent#mini-batch#random shuffling+12
๐Ÿ“šTheoryIntermediate

Bias-Variance Tradeoff

The biasโ€“variance tradeoff explains how prediction error splits into bias squared, variance, and irreducible noise for squared loss.

#bias variance tradeoff#mse decomposition#polynomial regression+12