Knowledge distillation loss blends standard hard-label cross-entropy with a soft match to the teacher's temperature-softened output distribution, typically measured by KL divergence and mixed via a weighting coefficient.
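A minimal NumPy sketch of that blend, assuming illustrative hyperparameters `T` (temperature) and `alpha` (mixing weight) and a KL-divergence soft term, as in standard distillation:

```python
import numpy as np

def softmax(logits, T=1.0):
    # Temperature-scaled softmax; subtracting the max is for numerical stability
    z = np.asarray(logits, dtype=float) / T
    z = z - z.max()
    e = np.exp(z)
    return e / e.sum()

def distillation_loss(student_logits, teacher_logits, hard_label, T=2.0, alpha=0.5):
    """Blend hard-label cross-entropy with a KL match to the teacher's
    temperature-softened distribution. T and alpha are illustrative choices."""
    # Hard term: cross-entropy of the student's (T=1) prediction vs. the true label
    p_student = softmax(student_logits)
    hard_ce = -np.log(p_student[hard_label])
    # Soft term: KL(teacher || student) at temperature T
    q_teacher = softmax(teacher_logits, T)
    q_student = softmax(student_logits, T)
    soft_kl = np.sum(q_teacher * (np.log(q_teacher) - np.log(q_student)))
    # T**2 rescales the soft term so its gradient magnitude matches the hard term
    return alpha * hard_ce + (1 - alpha) * (T ** 2) * soft_kl
```

When student and teacher logits coincide, the soft term vanishes and only the hard cross-entropy remains.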
Maximum Likelihood Estimation (MLE) picks parameters that make the observed data most probable under a chosen probabilistic model.
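As a small worked instance, consider MLE for a Bernoulli coin with model P(x=1) = p: the log-likelihood sum_i [x_i log p + (1 - x_i) log(1 - p)] is maximized in closed form at the sample mean. A sketch (the data and helper names are illustrative):

```python
import numpy as np

def bernoulli_log_likelihood(p, flips):
    # Log-probability of the observed 0/1 flips under parameter p
    flips = np.asarray(flips, dtype=float)
    return float(np.sum(flips * np.log(p) + (1 - flips) * np.log(1 - p)))

def bernoulli_mle(flips):
    # Closed-form maximizer of the log-likelihood above: the sample mean
    return float(np.mean(flips))

flips = [1, 1, 0, 1, 0, 1, 1, 0, 1, 1]
p_hat = bernoulli_mle(flips)  # 7 heads out of 10 flips -> 0.7
```

Evaluating the log-likelihood at `p_hat` against any other candidate (say 0.5 or 0.9) confirms the closed-form estimate is the maximizer for this data.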