How I Study AI - Learn AI Papers & Lectures the Easy Way

Concepts (152)

📚TheoryIntermediate

Autoregressive Models

Autoregressive (AR) models represent a joint distribution by multiplying conditional probabilities in a fixed order, using the chain rule of probability.

#autoregressive#ar model#n-gram+11
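
As a rough illustration of the chain-rule factorization described above, the sketch below scores a sequence by summing log-conditionals from left to right. The two-symbol vocabulary and the probability tables `p_first` and `p_next` are made-up values, and the bigram assumption (each symbol depends only on the previous one) is a simplification of the general case, where each conditional may depend on the whole prefix.

```python
import math

# Hypothetical toy bigram model over the vocabulary {"a", "b"}.
# p_first[x]      = p(x_1 = x)
# p_next[prev][x] = p(x_t = x | x_{t-1} = prev)
p_first = {"a": 0.6, "b": 0.4}
p_next = {
    "a": {"a": 0.7, "b": 0.3},
    "b": {"a": 0.2, "b": 0.8},
}

def sequence_log_prob(seq):
    """Chain rule in log space: log p(x) = sum over t of log p(x_t | x_{t-1})."""
    log_p = math.log(p_first[seq[0]])
    for prev, cur in zip(seq, seq[1:]):
        log_p += math.log(p_next[prev][cur])
    return log_p

# p("aba") = p(a) * p(b | a) * p(a | b) = 0.6 * 0.3 * 0.2
prob_aba = math.exp(sequence_log_prob("aba"))
```

Because every conditional is a normalized distribution, the probabilities of all sequences of a given length sum to one, which is what makes the product a valid joint distribution.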
📚TheoryAdvanced

Diffusion Models (Score-Based)

Score-based diffusion models corrupt data by gradually adding Gaussian noise and then learn to reverse this process by estimating the score, the gradient of the log-density.

#diffusion models#score-based modeling#ddpm+7
📚TheoryAdvanced

Normalizing Flows

Normalizing flows transform a simple base distribution (like a standard Gaussian) into a complex target distribution using a chain of invertible functions.

#normalizing flows#change of variables#jacobian determinant+12
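
A minimal sketch of the change-of-variables idea behind flows, assuming the simplest possible case: a single one-dimensional affine map x = scale * z + shift applied to a standard Gaussian base. The function names are illustrative and not taken from any library. The flow density is the base density at the inverted point minus the log of the absolute Jacobian determinant, which here is just log|scale|.

```python
import math

def standard_normal_log_pdf(z):
    """Log-density of the base distribution N(0, 1)."""
    return -0.5 * (z * z + math.log(2.0 * math.pi))

def affine_flow_log_pdf(x, scale, shift):
    """log p_x(x) for x = scale * z + shift with z ~ N(0, 1)."""
    z = (x - shift) / scale          # invert the flow
    log_det = math.log(abs(scale))   # log |dx/dz| for an affine map
    return standard_normal_log_pdf(z) - log_det

def normal_log_pdf(x, mean, std):
    """Closed-form log-density of N(mean, std^2), used only as a cross-check."""
    return (-0.5 * ((x - mean) / std) ** 2
            - math.log(std)
            - 0.5 * math.log(2.0 * math.pi))

flow_value = affine_flow_log_pdf(1.3, scale=2.0, shift=0.5)
reference_value = normal_log_pdf(1.3, mean=0.5, std=2.0)
```

The two values agree because an affine transform of a Gaussian is again Gaussian. Practical flows chain many invertible layers and add up the per-layer log-determinants in the same way.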
📚TheoryAdvanced

GAN Theory & Training Dynamics

GANs frame learning as a two-player game where a generator tries to fool a discriminator, and the discriminator tries to detect fakes.

#gan#generator#discriminator+12
📚TheoryAdvanced

Variational Autoencoders (VAE) Theory

A Variational Autoencoder (VAE) is a probabilistic autoencoder that learns to generate data by inferring hidden causes (latent variables) and decoding them back to observations.

#variational autoencoder#elbo#kl divergence+12
📚TheoryIntermediate

Maximum Likelihood & Generative Models

Maximum Likelihood Estimation (MLE) picks parameters that make the observed data most probable under a chosen probabilistic model.

#maximum likelihood#generative models#naive bayes+12
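
To make the definition above concrete, here is a small sketch for a Bernoulli (coin-flip) model, chosen only because its maximum likelihood estimate has a well-known closed form: the fraction of ones in the data. The data values are made up, and the grid search is included purely as a sanity check on the closed form.

```python
import math

def bernoulli_log_likelihood(theta, data):
    """Sum of log p(x | theta) over the observations, where x is 0 or 1."""
    return sum(math.log(theta if x == 1 else 1.0 - theta) for x in data)

data = [1, 0, 1, 1, 0, 1, 1, 0]      # made-up coin flips
theta_hat = sum(data) / len(data)    # closed-form MLE: fraction of ones

# Brute-force check: the best value on a coarse grid should sit next to theta_hat.
grid = [i / 100 for i in range(1, 100)]
best_on_grid = max(grid, key=lambda t: bernoulli_log_likelihood(t, data))
```

For models without a closed-form solution, the same log-likelihood is typically maximized numerically, for example by gradient ascent.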
📚TheoryIntermediate

Mixture of Experts (MoE)

A Mixture of Experts (MoE) routes each input to a small subset of specialized models called experts, enabling conditional computation.

#mixture of experts#moe#gating network+12
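
The routing idea can be sketched with a toy top-k gate. Everything below is hypothetical: the experts are simple scalar functions, the gate is a linear score per expert, and renormalizing with a softmax over only the selected experts is one common choice rather than the only one.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def moe_forward(x, gate_weights, experts, k=2):
    """Route scalar input x to the k highest-scoring experts and mix their outputs."""
    scores = [w * x for w in gate_weights]   # one gate score per expert
    top = sorted(range(len(experts)), key=lambda i: scores[i], reverse=True)[:k]
    weights = softmax([scores[i] for i in top])   # renormalize over the chosen experts
    output = sum(w * experts[i](x) for w, i in zip(weights, top))
    return output, top

# Hypothetical experts and gate parameters.
experts = [lambda x: x + 1.0, lambda x: 2.0 * x, lambda x: -x, lambda x: x * x]
gate_weights = [0.5, 1.0, -1.0, 0.1]

y, chosen = moe_forward(3.0, gate_weights, experts, k=2)
```

Only the experts listed in `chosen` are evaluated, which is the source of the compute savings: total parameter count grows with the number of experts while per-input cost grows with k.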
📚TheoryIntermediate

Key-Value Memory Systems

Key-Value memory systems store information as pairs where keys are used to look up values by similarity rather than exact match.

#key-value memory#attention#scaled dot-product+12
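
A minimal sketch of similarity-based lookup, assuming scaled dot-product similarity followed by a softmax, as in attention. The keys, values, and query are made-up numbers, and `kv_lookup` is an illustrative name rather than a library function.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def kv_lookup(query, keys, values):
    """Return a similarity-weighted average of values, plus the weights used."""
    d = len(query)
    # Scaled dot-product similarity between the query and every key.
    scores = [sum(q * k for q, k in zip(query, key)) / math.sqrt(d) for key in keys]
    weights = softmax(scores)
    dim = len(values[0])
    out = [sum(w * v[j] for w, v in zip(weights, values)) for j in range(dim)]
    return out, weights

keys = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0]]
values = [[10.0], [20.0], [30.0]]

# The query points mostly along the first key, so the result is pulled toward 10.0.
out, weights = kv_lookup([4.0, 0.5], keys, values)
```

Unlike a hash table, the lookup never fails on a missing key: every stored value contributes in proportion to how well its key matches the query.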
📚TheoryAdvanced

In-Context Learning Theory

In-context learning (ICL) means a model learns from examples provided in the input itself, without updating its parameters.

#in-context learning#transformer#attention+12
📚TheoryAdvanced

Transformer Expressiveness

Transformer expressiveness studies what kinds of sequence-to-sequence mappings a Transformer can represent or approximate.

#transformer expressiveness#universal approximation#self-attention+12
📚TheoryIntermediate

Self-Attention as Graph Neural Network

Self-attention can be viewed as message passing on a fully connected graph in which each token (node) sends an attention-weighted message to every token, itself included.

#self-attention#graph neural network#message passing+11
📚TheoryIntermediate

Multi-Head Attention

Multi-Head Attention runs several attention mechanisms in parallel so each head can focus on different relationships in the data.

#multi-head attention#scaled dot-product attention#transformer+12