๐ŸŽ“How I Study AIHISA
๐Ÿ“–Read
๐Ÿ“„Papers๐Ÿ“ฐBlogs๐ŸŽฌCourses
๐Ÿ’กLearn
๐Ÿ›ค๏ธPaths๐Ÿ“šTopics๐Ÿ’กConcepts๐ŸŽดShorts
๐ŸŽฏPractice
๐Ÿ“Daily Log๐ŸŽฏPrompts๐Ÿง Review
SearchSettings
How I Study AI - Learn AI Papers & Lectures the Easy Way

Concepts (152)

Groups

📐 Linear Algebra (15) · 📈 Calculus & Differentiation (10) · 🎯 Optimization (14) · 🎲 Probability Theory (12) · 📊 Statistics for ML (9) · 📡 Information Theory (10) · 🔺 Convex Optimization (7) · 🔢 Numerical Methods (6) · 🕸 Graph Theory for Deep Learning (6) · 🔵 Topology for ML (5) · 🌐 Differential Geometry (6) · ∞ Measure Theory & Functional Analysis (6) · 🎰 Random Matrix Theory (5) · 🌊 Fourier Analysis & Signal Processing (9) · 🎰 Sampling & Monte Carlo Methods (10) · 🧠 Deep Learning Theory (12) · 🛡️ Regularization Theory (11) · 👁️ Attention & Transformer Theory (10) · 🎨 Generative Model Theory (11) · 🔮 Representation Learning (10) · 🎮 Reinforcement Learning Mathematics (9) · 🔄 Variational Methods (8) · 📉 Loss Functions & Objectives (10) · ⏱️ Sequence & Temporal Models (8) · 💎 Geometric Deep Learning (8)

๐Ÿ“šTheoryAdvanced

Spectral Analysis of Neural Networks

Spectral analysis studies the distribution of eigenvalues and singular values of neural network weight matrices during training.

#spectral analysis #eigenvalues #singular values +12
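
A minimal NumPy sketch of what this looks like in practice: take a layer's weight matrix (here a random stand-in, not weights from any real model), compute its singular values, and summarize the spectrum.

```python
import numpy as np

# Minimal sketch: inspect the singular-value spectrum of a weight matrix.
# W is a hypothetical stand-in; in practice you would extract a trained
# layer's weights from your model.
rng = np.random.default_rng(0)
W = rng.normal(0.0, 1.0 / np.sqrt(512), size=(512, 256))

s = np.linalg.svd(W, compute_uv=False)   # singular values, descending
eigs = s**2                              # eigenvalues of W^T W
p = eigs / eigs.sum()

print("largest singular value :", s[0])
print("stable rank            :", eigs.sum() / eigs[0])
print("spectral entropy (nats):", -(p * np.log(p)).sum())
```

Tracking summaries like these across training epochs is how spectral analyses detect, for example, the emergence of heavy tails in the spectrum.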
๐Ÿ“šTheoryIntermediate

Universal Approximation Theorems

The Universal Approximation Theorems say that a neural network with at least one hidden layer and a suitable (non-polynomial) activation can approximate any continuous function on a compact domain as closely as you like.

#universal approximation theorem #cybenko #hornik +12
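
A quick illustration, not a proof: fix random tanh hidden units and solve only the output layer by least squares. The grid error on a compact interval shrinks as the hidden width grows; all sizes and seeds below are arbitrary choices.

```python
import numpy as np

# One tanh hidden layer fit to sin(x) on [-pi, pi]. Hidden weights are
# random and fixed; only the output layer is solved by least squares,
# which already drives the error down as the width grows.
rng = np.random.default_rng(0)
x = np.linspace(-np.pi, np.pi, 400)[:, None]
y = np.sin(x).ravel()

for width in (5, 50, 500):
    W = rng.normal(size=(1, width)) * 3.0   # random hidden weights
    b = rng.uniform(-np.pi, np.pi, width)   # random biases
    H = np.tanh(x @ W + b)                  # hidden activations
    coef, *_ = np.linalg.lstsq(H, y, rcond=None)
    err = np.max(np.abs(H @ coef - y))      # worst-case error on the grid
    print(f"width={width:4d}  max |error| = {err:.4f}")
```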
๐Ÿ“šTheoryAdvanced

Reproducing Kernel Hilbert Spaces (RKHS)

An RKHS is a space of functions where evaluating a function at a point equals taking an inner product with a kernel section, which enables the โ€œkernel trick.โ€

#rkhs #kernel trick #gram matrix +12
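
A minimal kernel ridge regression sketch, assuming an RBF kernel and toy data: every computation goes through the Gram matrix rather than explicit feature maps, and by the representer theorem the fitted function is a combination of kernel sections.

```python
import numpy as np

# Kernel ridge regression via the Gram matrix (the "kernel trick").
# The fitted RKHS function is f(x) = sum_i alpha_i * k(x_i, x).
def rbf(A, B, gamma=10.0):
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * d2)

rng = np.random.default_rng(0)
X = rng.uniform(0, 1, (40, 1))
y = np.sin(4 * X).ravel() + 0.1 * rng.normal(size=40)

lam = 1e-3
K = rbf(X, X)
alpha = np.linalg.solve(K + lam * np.eye(len(X)), y)  # representer theorem

X_test = np.linspace(0, 1, 5)[:, None]
print(rbf(X_test, X) @ alpha)  # f evaluated as inner products with sections
```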
๐Ÿ“šTheoryIntermediate

Topological Data Analysis (TDA)

Topological Data Analysis (TDA) studies the shape of data using tools from algebraic topology, producing summaries like Betti numbers, barcodes, and persistence diagrams.

#topological data analysis #persistent homology #vietoris–rips complex +12
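
A hedged sketch covering 0-dimensional persistence only (real pipelines use libraries such as ripser or gudhi and compute higher Betti numbers too): grow Vietoris–Rips balls, merge connected components with union-find, and record each component's (birth, death) bar.

```python
import numpy as np

# H0 persistence barcode of a point cloud via single-linkage union-find.
def h0_barcode(points):
    n = len(points)
    parent = list(range(n))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    d = np.linalg.norm(points[:, None] - points[None, :], axis=-1)
    edges = sorted((d[i, j], i, j) for i in range(n) for j in range(i + 1, n))
    bars = []
    for eps, i, j in edges:
        ri, rj = find(i), find(j)
        if ri != rj:                 # two components merge: one bar dies
            parent[ri] = rj
            bars.append((0.0, eps))  # every point is born at scale 0
    bars.append((0.0, np.inf))       # one component survives all scales
    return bars

rng = np.random.default_rng(0)
cloud = np.vstack([rng.normal(0, 0.1, (20, 2)), rng.normal(3, 0.1, (20, 2))])
long_bars = [b for b in h0_barcode(cloud) if b[1] > 1.0]
print(long_bars)   # two long bars: the cloud has two clusters (Betti_0 = 2)
```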
๐Ÿ“šTheoryIntermediate

Graph Isomorphism & WL Test

Graph isomorphism asks whether two graphs are the same up to renaming vertices; the Weisfeilerโ€“Leman (WL) test is a powerful heuristic that often distinguishes non-isomorphic graphs quickly.

#weisfeiler-leman #color refinement #graph isomorphism +10
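
A minimal 1-WL (color refinement) sketch on toy graphs: each round, a node's new color is the pair (own color, sorted multiset of neighbor colors). If the final color multisets of two graphs differ, the graphs are certainly non-isomorphic; equal multisets are inconclusive, which is the test's known blind spot on certain regular graphs.

```python
# 1-WL color refinement. Colors are nested tuples, so the multisets are
# directly comparable across graphs without any shared relabeling step.
def wl_refine(adj, rounds=3):
    colors = {v: 0 for v in adj}                  # uniform initial color
    for _ in range(rounds):
        colors = {v: (colors[v], tuple(sorted(colors[u] for u in adj[v])))
                  for v in adj}
    return sorted(colors.values())                # canonical color multiset

triangle = {0: [1, 2], 1: [0, 2], 2: [0, 1]}      # 3-cycle
path3 = {0: [1], 1: [0, 2], 2: [1]}               # path on 3 nodes
print(wl_refine(triangle) != wl_refine(path3))    # True: non-isomorphic
```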
๐Ÿ“šTheoryIntermediate

Message Passing Framework

Message Passing Neural Networks (MPNNs) learn on graphs by letting nodes repeatedly exchange and aggregate messages from their neighbors.

#message passing neural network #mpnn #graph neural network +12
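
One round of message passing in NumPy, as a sketch: mean aggregation over neighbors followed by a ReLU update. The weight matrices are random stand-ins; a real GNN would learn them.

```python
import numpy as np

# One message-passing layer with mean aggregation:
#   h_v' = ReLU(W_self h_v + W_nbr * mean over neighbors u of h_u)
rng = np.random.default_rng(0)
A = np.array([[0, 1, 1, 0],           # adjacency of a 4-node graph
              [1, 0, 1, 0],
              [1, 1, 0, 1],
              [0, 0, 1, 0]], float)
H = rng.normal(size=(4, 8))           # node features
W_self = rng.normal(size=(8, 8))
W_nbr = rng.normal(size=(8, 8))

deg = A.sum(1, keepdims=True)
messages = (A @ H) / deg              # aggregate: mean over neighbors
H_next = np.maximum(H @ W_self + messages @ W_nbr, 0.0)  # update (ReLU)
print(H_next.shape)                   # (4, 8): one round of message passing
```

Stacking k such layers lets each node see information from its k-hop neighborhood.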
๐Ÿ“šTheoryAdvanced

Maximum Entropy Principle

The Maximum Entropy Principle picks the probability distribution with the greatest uncertainty (entropy) that still satisfies the facts you know (constraints).

#maximum entropy principle #jaynes #exponential family +12
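
Jaynes' dice example as a sketch: among all distributions on {1,...,6} with mean 4.5, the maximum entropy one is exponential-family, p_k proportional to exp(lam * k). Here the multiplier lam is found by bisection so the mean constraint holds.

```python
import numpy as np

# Max-entropy distribution on {1,...,6} subject to a mean of 4.5.
k = np.arange(1, 7)

def mean_at(lam):
    p = np.exp(lam * k)
    p /= p.sum()
    return p @ k

lo, hi = -5.0, 5.0                  # mean_at is increasing in lam
for _ in range(60):                 # bisection on the constraint
    mid = (lo + hi) / 2
    lo, hi = (mid, hi) if mean_at(mid) < 4.5 else (lo, mid)

p = np.exp(lo * k)
p /= p.sum()
print(np.round(p, 4), "mean =", round(p @ k, 4))
```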
๐Ÿ“šTheoryAdvanced

Rate-Distortion Theory

Rateโ€“distortion theory tells you the minimum number of bits per symbol needed to represent data while keeping average distortion below a target D.

#rate-distortion #mutual information #blahut-arimoto +12
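
A minimal Blahut–Arimoto sketch for a Bernoulli(1/2) source under Hamming distortion: for a fixed trade-off slope beta, alternate the two BA updates until the reproduction marginal stabilizes, then read off one (distortion, rate) point on the R(D) curve. For this source, theory gives R(D) = 1 - H_b(D) bits as a sanity check.

```python
import numpy as np

# Blahut-Arimoto for R(D) of a fair binary source with Hamming distortion.
p_x = np.array([0.5, 0.5])
d = np.array([[0.0, 1.0],        # d(x, x_hat): Hamming distortion
              [1.0, 0.0]])
beta = 2.5                       # trade-off slope (arbitrary choice)

q = np.array([0.5, 0.5])                       # init reproduction marginal
for _ in range(200):
    w = q * np.exp(-beta * d)                  # unnormalized p(x_hat | x)
    cond = w / w.sum(axis=1, keepdims=True)
    q = p_x @ cond                             # new reproduction marginal

joint = p_x[:, None] * cond
D = (joint * d).sum()                          # average distortion
R = (joint * np.log2(cond / q)).sum()          # mutual information in bits
print(f"distortion D = {D:.4f}, rate R = {R:.4f} bits")
```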
๐Ÿ“šTheoryAdvanced

Information Bottleneck

The Information Bottleneck (IB) principle formalizes the tradeoff between compressing an input X and preserving information about a target Y, via the objective min over p(t|x) of I(X;T) - β·I(T;Y).

#information bottleneck #mutual information #kl divergence +12
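
A sketch that only evaluates the IB objective for a fixed toy encoder; actually optimizing the encoder (with Tishby's iterative updates or a variational bound) is the harder part omitted here. The joint p(x, y) and encoder p(t|x) below are made up for illustration.

```python
import numpy as np

def mi(joint):
    """Mutual information (nats) of a 2-D joint distribution."""
    px = joint.sum(1, keepdims=True)
    py = joint.sum(0, keepdims=True)
    mask = joint > 0
    return (joint[mask] * np.log(joint[mask] / (px @ py)[mask])).sum()

p_xy = np.array([[0.4, 0.1],       # p(x, y): two correlated bits
                 [0.1, 0.4]])
enc = np.array([[0.9, 0.1],        # p(t|x): noisy 1-bit compression of X
                [0.1, 0.9]])
beta = 3.0

p_x = p_xy.sum(1)
p_xt = p_x[:, None] * enc          # joint p(x, t)
p_ty = enc.T @ p_xy                # joint p(t, y) via the chain T - X - Y
print("I(X;T) =", mi(p_xt), " I(T;Y) =", mi(p_ty))
print("IB objective:", mi(p_xt) - beta * mi(p_ty))
```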
๐Ÿ“šTheoryIntermediate

Cross-Entropy

Cross-entropy measures how well a proposed distribution Q predicts outcomes actually generated by a true distribution P.

#cross-entropy #entropy #kl divergence +12
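
A few lines verifying the decomposition H(P, Q) = H(P) + KL(P || Q) on a toy pair of distributions: the cross-entropy is the true entropy plus the extra cost of predicting with Q when the data really come from P.

```python
import numpy as np

p = np.array([0.7, 0.2, 0.1])   # true distribution P
q = np.array([0.5, 0.3, 0.2])   # model's predicted distribution Q

ce = -(p * np.log(q)).sum()     # cross-entropy H(P, Q)
h = -(p * np.log(p)).sum()      # entropy H(P)
kl = (p * np.log(p / q)).sum()  # KL(P || Q)
print(f"H(P,Q) = {ce:.4f} = H(P) + KL = {h:.4f} + {kl:.4f}")
```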
๐Ÿ“šTheoryIntermediate

KL Divergence

KL divergence measures how much information is lost when using model Q to approximate the true distribution P.

#kl divergence #relative entropy #cross-entropy +12
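
A toy check of two basic properties: KL is non-negative (Gibbs' inequality) and asymmetric, so despite the name it is not a distance metric.

```python
import numpy as np

def kl(p, q):
    """KL(P || Q) = sum_k p_k * log(p_k / q_k), in nats."""
    return (p * np.log(p / q)).sum()

p = np.array([0.7, 0.2, 0.1])
q = np.array([0.5, 0.3, 0.2])
print("KL(P||Q) =", round(kl(p, q), 4))   # information lost using Q for P
print("KL(Q||P) =", round(kl(q, p), 4))   # differs: KL is asymmetric
```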
๐Ÿ“šTheoryIntermediate

Empirical Risk Minimization

Empirical Risk Minimization (ERM) chooses a model that minimizes the average loss on the training data.

#empirical risk minimization #expected risk #loss function +12
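
A minimal ERM sketch: gradient descent on the average squared loss of a linear model over synthetic data. The data-generating weights, learning rate, and step count below are arbitrary stand-ins.

```python
import numpy as np

# ERM for linear regression: minimize the empirical risk
#   R_hat(w) = (1/n) * sum_i (x_i . w - y_i)^2
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))
w_true = np.array([2.0, -1.0, 0.5])            # hypothetical ground truth
y = X @ w_true + 0.1 * rng.normal(size=200)

w = np.zeros(3)
lr = 0.1
for _ in range(500):
    resid = X @ w - y
    risk = (resid ** 2).mean()                 # empirical risk
    w -= lr * (2 / len(X)) * (X.T @ resid)     # gradient step

print("learned w:", np.round(w, 3), " final risk:", round(risk, 5))
```

The gap between this training-set risk and the expected risk on fresh data is exactly what generalization bounds for ERM try to control.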