Concepts (3)
Theory · Intermediate
ELBO (Evidence Lower Bound)
The Evidence Lower Bound (ELBO) is a tractable lower bound on the log evidence, log p(x), that enables learning and inference in latent-variable models such as VAEs.
#elbo, #variational inference, #vae (+12 more)
Theory · Advanced
Policy Gradient Theorem
The policy gradient theorem gives the gradient of expected return with respect to a stochastic policy's parameters, so the parameters can be adjusted by gradient ascent to increase expected return.
#policy gradient, #reinforce, #actor-critic (+11 more)
Theory · Advanced
Measure Theory
Measure theory generalizes length, area, and probability to abstract spaces while preserving countable additivity.
#measure theory, #sigma-algebra, #lebesgue integral (+12 more)