How I Study AI - Learn AI Papers & Lectures the Easy Way

Concepts (15)

∑ Math · Intermediate

Surrogate Loss Theory

The 0-1 loss directly measures classification error, but it is discontinuous and non-convex, making it computationally hard to optimize; surrogate losses such as the hinge or logistic loss replace it with a convex upper bound that standard optimizers can minimize.

#surrogate loss · #0-1 loss · #hinge loss +12
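As a minimal NumPy sketch (function names here are illustrative, not from any library), the hinge loss upper-bounds the 0-1 loss at every margin while staying convex:

```python
import numpy as np

def zero_one_loss(margin):
    # 1 when misclassified (margin <= 0), else 0; discontinuous at zero
    return (margin <= 0).astype(float)

def hinge_loss(margin):
    # convex surrogate max(0, 1 - margin); upper-bounds the 0-1 loss everywhere
    return np.maximum(0.0, 1.0 - margin)

margins = np.array([-1.0, 0.0, 0.5, 2.0])
# the hinge values (2, 1, 0.5, 0) dominate the 0-1 values (1, 1, 0, 0)
print(zero_one_loss(margins))
print(hinge_loss(margins))
```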
∑ Math · Intermediate

Elastic Net Regularization

Elastic Net regularization combines L1 (Lasso) and L2 (Ridge) penalties to produce models that are both sparse and stable.

#elastic net · #lasso +12
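A small sketch of the penalty itself, written against scikit-learn's `alpha`/`l1_ratio` convention (an assumption; the card does not fix a parameterization): `l1_ratio = 1` recovers the Lasso and `l1_ratio = 0` the Ridge penalty.

```python
import numpy as np

def elastic_net_penalty(w, alpha, l1_ratio):
    # blend of the L1 term (drives sparsity) and the L2 term (drives stability)
    l1 = np.sum(np.abs(w))
    l2 = 0.5 * np.sum(w ** 2)
    return alpha * (l1_ratio * l1 + (1.0 - l1_ratio) * l2)

w = np.array([0.0, 1.0, -2.0])
print(elastic_net_penalty(w, alpha=1.0, l1_ratio=1.0))  # pure Lasso: 3.0
print(elastic_net_penalty(w, alpha=1.0, l1_ratio=0.0))  # pure Ridge: 2.5
print(elastic_net_penalty(w, alpha=1.0, l1_ratio=0.5))  # even blend: 2.75
```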
∑ Math · Intermediate

L1 Regularization (Lasso)

L1 regularization (Lasso) adds a penalty \(\lambda \sum_{i=1}^{p} |w_i|\) to the loss, which pushes many coefficients exactly to zero and performs feature selection.

#lasso · #l1 regularization · #soft-thresholding +12
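The soft-thresholding operator mentioned in the tags is the closed-form proximal step for the L1 penalty; a minimal NumPy sketch (names are illustrative):

```python
import numpy as np

def soft_threshold(x, lam):
    # prox of lam * |.|: shifts magnitudes toward zero and
    # zeroes out anything inside [-lam, lam] exactly
    return np.sign(x) * np.maximum(np.abs(x) - lam, 0.0)

x = np.array([-3.0, -0.5, 0.2, 2.0])
# with lam = 1: -3 -> -2, 2 -> 1, and the two small coefficients become exactly 0
print(soft_threshold(x, 1.0))
```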
⚙️ Algorithm · Advanced

ADMM (Alternating Direction Method of Multipliers)

ADMM splits a hard optimization problem into two easier subproblems that are solved alternately and kept consistent by a simple dual-variable update on the constraint residual.

#admm · #alternating direction method of multipliers · #augmented lagrangian +11
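A compact sketch of ADMM applied to the Lasso, splitting \(x = z\) (variable names and defaults are illustrative): the \(x\)-step solves a smooth quadratic, the \(z\)-step is a soft-threshold, and the dual update accumulates the consensus residual.

```python
import numpy as np

def admm_lasso(A, b, lam, rho=1.0, iters=200):
    # minimize 0.5*||Ax - b||^2 + lam*||z||_1   subject to   x = z
    n = A.shape[1]
    x, z, u = np.zeros(n), np.zeros(n), np.zeros(n)
    AtA, Atb = A.T @ A, A.T @ b
    M = np.linalg.inv(AtA + rho * np.eye(n))  # cached: the x-step is a linear solve
    for _ in range(iters):
        x = M @ (Atb + rho * (z - u))                                  # quadratic step
        z = np.sign(x + u) * np.maximum(np.abs(x + u) - lam / rho, 0)  # soft-threshold
        u = u + x - z                                                  # dual ascent on x = z
    return z

rng = np.random.default_rng(0)
A = rng.normal(size=(50, 10))
w_true = np.zeros(10)
w_true[:3] = [2.0, -1.5, 1.0]
b = A @ w_true                  # noiseless measurements of a sparse signal
w_hat = admm_lasso(A, b, lam=0.1)
print(w_hat)                    # close to w_true, trailing coordinates near 0
```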
⚙️ Algorithm · Intermediate

Proximal Operators & Methods

A proximal operator pulls a point x toward minimizing a function f while penalizing how far it moves, acting like a denoiser or projector depending on f.

#proximal operator · #ista · #fista +12
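Two proximal operators with closed forms, sketched in NumPy (names are illustrative): the prox of a box indicator is plain projection (clipping), while the prox of \(\frac{1}{2}\|v\|^2\) shrinks the point toward the origin, illustrating the projector-vs-denoiser behavior the card describes.

```python
import numpy as np

def prox_box(x, lo, hi):
    # prox of the indicator of [lo, hi]^n: Euclidean projection, i.e. clipping
    return np.clip(x, lo, hi)

def prox_squared_norm(x, t):
    # prox_{t f}(x) for f(v) = 0.5*||v||^2 has the closed form x / (1 + t):
    # a shrinkage toward the origin, acting like a simple denoiser
    return x / (1.0 + t)

x = np.array([-2.0, 0.3, 5.0])
print(prox_box(x, -1.0, 1.0))     # clipped into the box: -1.0, 0.3, 1.0
print(prox_squared_norm(x, 1.0))  # halved: -1.0, 0.15, 2.5
```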
∑ Math · Advanced

KKT Conditions

KKT conditions generalize Lagrange multipliers to handle inequality constraints in constrained optimization problems.

#kkt conditions · #lagrangian · #complementary slackness +12
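Concretely, for the problem \(\min_x f(x)\) subject to \(g_i(x) \le 0\) and \(h_j(x) = 0\), with multipliers \(\mu_i \ge 0\) and \(\lambda_j\), the four KKT conditions at a candidate optimum \(x^\star\) are:

\[
\begin{aligned}
&\nabla f(x^\star) + \textstyle\sum_i \mu_i \nabla g_i(x^\star) + \sum_j \lambda_j \nabla h_j(x^\star) = 0 && \text{(stationarity)} \\
&g_i(x^\star) \le 0, \quad h_j(x^\star) = 0 && \text{(primal feasibility)} \\
&\mu_i \ge 0 && \text{(dual feasibility)} \\
&\mu_i \, g_i(x^\star) = 0 && \text{(complementary slackness)}
\end{aligned}
\]

With no inequality constraints, the \(\mu_i\) terms vanish and the system reduces to the classical Lagrange-multiplier conditions.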
∑ Math · Intermediate

Convex Optimization Problems

A convex optimization problem minimizes a convex function over a convex set, guaranteeing that every local minimum is a global minimum.

#convex optimization · #gradient descent · #projected gradient +12
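A tiny projected-gradient sketch (names illustrative): minimize the convex \(f(x) = \frac{1}{2}\|x - c\|^2\) over the convex unit ball. Because both pieces are convex, the single minimum the loop finds is the global one.

```python
import numpy as np

def project_unit_ball(x):
    # Euclidean projection onto the convex set {x : ||x|| <= 1}
    norm = np.linalg.norm(x)
    return x if norm <= 1.0 else x / norm

def projected_gradient(c, lr=0.1, steps=100):
    # minimize 0.5*||x - c||^2 over the unit ball: gradient step, then project
    x = np.zeros_like(c)
    for _ in range(steps):
        x = project_unit_ball(x - lr * (x - c))
    return x

c = np.array([3.0, 4.0])
print(projected_gradient(c))  # converges to c / ||c|| = [0.6, 0.8]
```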
📚 Theory · Advanced

Maximum Entropy Principle

The Maximum Entropy Principle picks the probability distribution with the greatest uncertainty (entropy) that still satisfies the facts you know (constraints).

#maximum entropy principle · #jaynes · #exponential family +12
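A quick numerical check of the principle (example chosen for illustration): among distributions over six outcomes constrained only to sum to one, the uniform distribution attains the maximum entropy \(\log 6\); any skew encodes assumptions beyond the constraints and lowers the entropy.

```python
import numpy as np

def entropy(p):
    # Shannon entropy in nats, with the 0 * log(0) = 0 convention
    p = np.asarray(p, dtype=float)
    p = p[p > 0]
    return -np.sum(p * np.log(p))

uniform = np.ones(6) / 6.0
skewed = np.array([0.5, 0.2, 0.1, 0.1, 0.05, 0.05])
print(entropy(uniform))  # log(6) ~ 1.7918, the maximum over 6 outcomes
print(entropy(skewed))   # strictly smaller: the skew adds unwarranted information
```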
⚙️ Algorithm · Intermediate

Momentum Methods

Momentum methods add an exponentially weighted memory of past gradients to make descent steps smoother and faster, especially in ravines and ill-conditioned problems.

#momentum · #heavy-ball · #polyak momentum +12
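A heavy-ball sketch on an ill-conditioned quadratic ravine (parameters are illustrative): the velocity `v` is exactly the exponentially weighted memory of past gradients the card describes.

```python
import numpy as np

def heavy_ball(grad, x0, lr=0.02, beta=0.9, steps=300):
    # Polyak momentum: v accumulates past gradients with decay factor beta
    x = np.array(x0, dtype=float)
    v = np.zeros_like(x)
    for _ in range(steps):
        v = beta * v - lr * grad(x)
        x = x + v
    return x

# f(x) = 0.5 * (x1^2 + 25 * x2^2): a narrow ravine with condition number 25
grad = lambda x: np.array([x[0], 25.0 * x[1]])
print(heavy_ball(grad, [5.0, 1.0]))  # both coordinates driven near 0
```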
⚙️ Algorithm · Intermediate

Stochastic Gradient Descent (SGD)

Stochastic Gradient Descent (SGD) updates model parameters using small random subsets (mini-batches) of data, making learning faster and more memory-efficient.

#stochastic gradient descent · #mini-batch · #random shuffling +12
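A mini-batch SGD sketch for least-squares regression (function name, hyperparameters, and the noiseless setup are all illustrative), showing the epoch-wise reshuffle and per-batch gradient step:

```python
import numpy as np

def sgd_linear_regression(X, y, lr=0.1, batch_size=8, epochs=50, seed=0):
    # mini-batch SGD on the least-squares loss, reshuffling every epoch
    rng = np.random.default_rng(seed)
    w = np.zeros(X.shape[1])
    n = len(y)
    for _ in range(epochs):
        order = rng.permutation(n)            # random shuffling of the data
        for start in range(0, n, batch_size):
            batch = order[start:start + batch_size]
            Xb, yb = X[batch], y[batch]
            w -= lr * Xb.T @ (Xb @ w - yb) / len(batch)  # mini-batch gradient step
    return w

rng = np.random.default_rng(1)
X = rng.normal(size=(200, 3))
w_true = np.array([1.0, -2.0, 0.5])
y = X @ w_true                      # noiseless targets, so SGD can recover w_true
print(sgd_linear_regression(X, y))  # close to [1.0, -2.0, 0.5]
```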
⚙️ Algorithm · Intermediate

Gradient Descent

Gradient descent is a simple, repeatable way to move downhill on a loss surface by stepping in the opposite direction of the gradient.

#gradient descent · #batch gradient descent · #learning rate +12
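The whole loop fits in a few lines (a sketch with illustrative names): for \(f(x) = (x - 3)^2\) the gradient is \(2(x - 3)\), and repeated steps against it walk downhill to the minimum at \(x = 3\).

```python
def gradient_descent(grad, x0, lr=0.1, steps=100):
    # repeat: step opposite the gradient, scaled by the learning rate
    x = x0
    for _ in range(steps):
        x = x - lr * grad(x)
    return x

# f(x) = (x - 3)^2  =>  grad f(x) = 2 * (x - 3); minimum at x = 3
x_min = gradient_descent(lambda x: 2.0 * (x - 3.0), x0=0.0)
print(x_min)  # ~ 3.0
```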
📚 Theory · Intermediate

Lagrangian Duality

Lagrangian duality turns a constrained minimization problem into a related maximization problem that provides lower bounds on the original objective.

#lagrangian duality · #kkt conditions · #slater condition +11
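A worked example (chosen for illustration, not taken from the card): for the primal \(\min_x x^2\) subject to \(x \ge 1\), the dual function \(g(\mu) = \min_x [x^2 + \mu(1 - x)] = \mu - \mu^2/4\) lower-bounds the primal optimum \(f^\star = 1\) for every \(\mu \ge 0\), and maximizing it closes the gap.

```python
import numpy as np

# primal: minimize x^2 subject to x >= 1   (optimum f* = 1 at x = 1)
# L(x, mu) = x^2 + mu * (1 - x) is minimized at x = mu / 2, giving the dual:
def dual(mu):
    return mu - mu ** 2 / 4.0

mus = np.linspace(0.0, 4.0, 401)
g = dual(mus)
print(g.max())          # ~ 1.0: the best lower bound equals f* (zero duality gap)
print(mus[g.argmax()])  # ~ 2.0: the optimal multiplier
```

Since the problem is convex and strictly feasible (Slater's condition holds), strong duality makes the bound tight.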