Temporal Difference (TD) learning updates value estimates by bootstrapping from the next state's current estimate, so it can learn online, after every step, without waiting for the episode's final return.
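As a rough illustration of the bootstrapped update, here is a minimal sketch of the tabular TD(0) rule. The function name `td0_update`, the dictionary-based value table, and the step-size and discount values are assumptions made for this example, not something the text above specifies.

```python
def td0_update(values, state, reward, next_state,
               alpha=0.1, gamma=0.99, terminal=False):
    """Apply one tabular TD(0) update in place and return the TD error.

    values     : dict mapping state -> current value estimate (hypothetical layout)
    alpha      : step size (assumed value)
    gamma      : discount factor (assumed value)
    terminal   : True if next_state ends the episode, so nothing is bootstrapped
    """
    # Bootstrap from the current estimate of the next state.
    bootstrap = 0.0 if terminal else values[next_state]
    # TD error: (reward + discounted next estimate) minus the current estimate.
    td_error = reward + gamma * bootstrap - values[state]
    # Move the current estimate a fraction alpha toward the target.
    values[state] += alpha * td_error
    return td_error


# One observed transition A -> B with reward 1.0 updates V(A) immediately.
values = {"A": 0.0, "B": 0.0}
td0_update(values, "A", reward=1.0, next_state="B", alpha=0.5, gamma=0.9)
```

Because the update needs only a single transition, it can be applied after every step of interaction rather than at the end of an episode.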
Dynamic programming (DP) over probabilities models how probability mass flows between states over time: at each step, the mass in every state is redistributed to its successors according to the transition probabilities.
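The repeated redistribution can be sketched as a small loop over a transition table. The function name `propagate`, the list-of-lists representation, and the two-state example chain are assumptions chosen for illustration.

```python
def propagate(dist, transitions, steps=1):
    """Push a probability distribution forward `steps` time steps.

    dist        : list where dist[i] is the probability of being in state i
    transitions : transitions[i][j] is the probability of moving from i to j
                  (each row is assumed to sum to 1)
    """
    n = len(dist)
    for _ in range(steps):
        new_dist = [0.0] * n
        for i, mass in enumerate(dist):
            for j, p in enumerate(transitions[i]):
                # State i sends a share p of its mass to state j.
                new_dist[j] += mass * p
        dist = new_dist
    return dist


# Hypothetical two-state chain: state 0 stays or leaves with equal chance,
# state 1 is absorbing.
transitions = [[0.5, 0.5],
               [0.0, 1.0]]
after_two = propagate([1.0, 0.0], transitions, steps=2)
```

Each pass costs time proportional to the number of state pairs, and the total mass stays at 1 as long as every row of the transition table sums to 1.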