How I Study AI - Learn AI Papers & Lectures the Easy Way

Concepts (152)

Groups

📐 Linear Algebra (15)
📈 Calculus & Differentiation (10)
🎯 Optimization (14)
🎲 Probability Theory (12)
📊 Statistics for ML (9)
📡 Information Theory (10)
🔺 Convex Optimization (7)
🔢 Numerical Methods (6)
🕸 Graph Theory for Deep Learning (6)
🔵 Topology for ML (5)
🌐 Differential Geometry (6)
∞ Measure Theory & Functional Analysis (6)
🎰 Random Matrix Theory (5)
🌊 Fourier Analysis & Signal Processing (9)
🎰 Sampling & Monte Carlo Methods (10)
🧠 Deep Learning Theory (12)
🛡️ Regularization Theory (11)
👁️ Attention & Transformer Theory (10)
🎨 Generative Model Theory (11)
🔮 Representation Learning (10)
🎮 Reinforcement Learning Mathematics (9)
🔄 Variational Methods (8)
📉 Loss Functions & Objectives (10)
⏱️ Sequence & Temporal Models (8)
💎 Geometric Deep Learning (8)

📚 Theory · Advanced

Transformer Theory

Transformers map sequences to sequences using layers of self-attention and feed-forward networks wrapped with residual connections and LayerNorm.

#transformer#self-attention#positional encoding+12

📚 Theory · Advanced

Reinforcement Learning Theory

Reinforcement Learning (RL) studies how an agent learns to act in an environment to maximize long-term cumulative reward.

#reinforcement learning#mdp#bellman equation+12

📚 Theory · Advanced

Neural Tangent Kernel (NTK) Theory

The Neural Tangent Kernel (NTK) connects very wide neural networks to classical kernel methods, letting us study training as if it were kernel regression.

#neural tangent kernel#ntk#infinite width+12

📚 Theory · Intermediate

Attention Mechanism Theory

Attention computes a weighted sum of values V where the weights come from how similar queries Q are to keys K.

#attention#self-attention#multi-head attention+12
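
The weighted sum described in this card can be sketched in a few lines. This is a minimal illustration, assuming scaled dot-product similarity and a softmax over the scores (the card itself does not say which similarity is used); the function names and the toy Q, K, V values are hypothetical.

```python
import math

def softmax(scores):
    # Subtract the max before exponentiating for numerical stability.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attention(Q, K, V):
    """Assumed form: softmax(Q K^T / sqrt(d_k)) V, on plain lists of lists."""
    d_k = len(K[0])
    dim_v = len(V[0])
    outputs = []
    for q in Q:
        # Similarity of this query to every key (scaled dot product).
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d_k) for k in K]
        weights = softmax(scores)
        # Weighted sum of the value vectors.
        outputs.append([sum(w * v[j] for w, v in zip(weights, V)) for j in range(dim_v)])
    return outputs

# Toy example: the query matches the first key more closely than the second.
Q = [[1.0, 0.0]]
K = [[1.0, 0.0], [0.0, 1.0]]
V = [[10.0, 0.0], [0.0, 10.0]]
result = attention(Q, K, V)
```

Because the query aligns with the first key, the output leans toward the first value vector while still mixing in some of the second.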

📚 Theory · Intermediate

Scaling Laws

Scaling laws say that model loss typically follows a power law, decreasing predictably as you increase parameters, data, or compute.

#scaling laws#power law#chinchilla scaling+12
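
A power law shows up as a straight line on log-log axes, which is what makes it easy to extrapolate. The sketch below assumes a pure power law, loss = a * n^(-alpha), with no irreducible-loss term; the constants and function names are hypothetical and are not taken from any published fit.

```python
import math

def power_law_loss(n, a, alpha):
    """Hypothetical pure power law: loss = a * n^(-alpha)."""
    return a * n ** (-alpha)

def estimate_exponent(n1, loss1, n2, loss2):
    """Negated slope of the log-log line through two (scale, loss) points."""
    return -(math.log(loss2) - math.log(loss1)) / (math.log(n2) - math.log(n1))

# Made-up constants, for illustration only.
A, ALPHA = 10.0, 0.5
small = power_law_loss(1e6, A, ALPHA)
large = power_law_loss(1e8, A, ALPHA)

# Two measurements are enough to recover the exponent of a pure power law.
recovered = estimate_exponent(1e6, small, 1e8, large)
```

Real loss curves are noisier and usually level off at a floor, so a practical fit would use many points and an extra constant term.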

📚 Theory · Advanced

Calculus of Variations

Calculus of variations optimizes functionals, which map whole functions to numbers, rather than ordinary functions of numbers.

#calculus of variations#euler–lagrange#functional derivative+12

📚 Theory · Advanced

Deep Learning Generalization Theory

Deep learning generalization theory tries to explain why overparameterized networks can fit (interpolate) training data yet still perform well on new data.

#generalization#implicit regularization#minimum norm+12

📚 Theory · Advanced

Neural Network Expressivity

Neural network expressivity studies what kinds of functions different network architectures can represent and how efficiently they can do so.

#neural network expressivity#depth separation#relu linear regions+12

📚 Theory · Advanced

Statistical Learning Theory

Statistical learning theory explains why a model that fits training data can still predict well on unseen data: it bounds true risk by empirical risk plus a complexity term.

#statistical learning theory#empirical risk minimization#structural risk minimization+11

📚 Theory · Intermediate

Universal Approximation Theorem

The Universal Approximation Theorem (UAT) says a feedforward neural network with one hidden layer, enough hidden units, and a non-polynomial activation (like sigmoid or ReLU) can approximate any continuous function on a compact set as closely as we want.

#universal approximation theorem#cybenko#hornik+12

📚 Theory · Intermediate

Minimax Theorem

The Minimax Theorem states that in zero-sum two-player games with suitable convexity and compactness, the best payoff the maximizer can guarantee equals the smallest loss the minimizer can guarantee (max-min equals min-max).

#minimax theorem#zero-sum games#saddle point+12

📚 Theory · Intermediate

PAC Learning

PAC learning formalizes when a learner can probably (with probability at least 1−δ) and approximately (error at most ε) succeed using a number of samples polynomial in 1/ε and 1/δ.

#pac learning#agnostic learning#vc dimension+12
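
To make the roles of ε and δ concrete, here is a small sketch of one standard sample-size bound. It assumes the simplest setting, which the card does not specify: a finite hypothesis class, the realizable case, and a learner that outputs a hypothesis consistent with the data, where m ≥ (ln|H| + ln(1/δ)) / ε samples suffice. The function name and the example numbers are hypothetical.

```python
import math

def pac_sample_bound(hypothesis_count, epsilon, delta):
    """Samples sufficient for a consistent learner over a finite class
    in the realizable case: m >= (ln|H| + ln(1/delta)) / epsilon."""
    return math.ceil((math.log(hypothesis_count) + math.log(1.0 / delta)) / epsilon)

# Illustrative numbers only: 1000 hypotheses, 10% error, 95% confidence.
m = pac_sample_bound(hypothesis_count=1000, epsilon=0.1, delta=0.05)

# Halving the error tolerance roughly doubles the required sample size.
m_tighter = pac_sample_bound(hypothesis_count=1000, epsilon=0.05, delta=0.05)
```

The bound grows only logarithmically in 1/δ and in the size of the class, but linearly in 1/ε, so demanding lower error is the expensive part.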