🎓How I Study AIHISA
📖Read
📄Papers📰Blogs🎬Courses
💡Learn
🛤️Paths📚Topics💡Concepts🎴Shorts
🎯Practice
📝Daily Log🎯Prompts🧠Review
SearchSettings
How I Study AI - Learn AI Papers & Lectures the Easy Way

Concepts5

Groups

📐Linear Algebra15📈Calculus & Differentiation10🎯Optimization14🎲Probability Theory12📊Statistics for ML9📡Information Theory10🔺Convex Optimization7🔢Numerical Methods6🕸Graph Theory for Deep Learning6🔵Topology for ML5🌐Differential Geometry6∞Measure Theory & Functional Analysis6🎰Random Matrix Theory5🌊Fourier Analysis & Signal Processing9🎰Sampling & Monte Carlo Methods10🧠Deep Learning Theory12🛡️Regularization Theory11👁️Attention & Transformer Theory10🎨Generative Model Theory11🔮Representation Learning10🎮Reinforcement Learning Mathematics9🔄Variational Methods8📉Loss Functions & Objectives10⏱️Sequence & Temporal Models8💎Geometric Deep Learning8

Category

🔷All∑Math⚙️Algo🗂️DS📚Theory

Level

AllBeginnerIntermediate
∑MathIntermediate

Surrogate Loss Theory

0-1 loss directly measures classification error but is discontinuous and non-convex, making optimization computationally hard.

#surrogate loss#0-1 loss#hinge loss+12
📚TheoryIntermediate

Empirical Risk Minimization

Empirical Risk Minimization (ERM) chooses a model that minimizes the average loss on the training data.

#empirical risk minimization
Advanced
Filtering by:
#empirical risk minimization
#expected risk
#loss function
+12
⚙️AlgorithmIntermediate

Stochastic Gradient Descent (SGD)

Stochastic Gradient Descent (SGD) updates model parameters using small random subsets (mini-batches) of data, making learning faster and more memory-efficient.

#stochastic gradient descent#mini-batch#random shuffling+12
📚TheoryAdvanced

Statistical Learning Theory

Statistical learning theory explains why a model that fits training data can still predict well on unseen data by relating true risk to empirical risk plus a complexity term.

#statistical learning theory#empirical risk minimization#structural risk minimization+11
📚TheoryIntermediate

PAC Learning

PAC learning formalizes when a learner can probably (with probability at least 1−δ) and approximately (error at most ε) succeed using a polynomial number of samples.

#pac learning#agnostic learning#vc dimension+12