Groups
The policy gradient theorem tells us how to push a stochastic policyโs parameters to increase expected return by following the gradient of expected rewards.
Measure theory generalizes length, area, and probability to very flexible spaces while keeping countable additivity intact.