The discounted return G_t sums all future rewards but down-weights distant rewards by powers of a discount factor γ.
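As a rough illustration of the discounted return, the sketch below computes it for a finite list of rewards. The function name `discounted_return` and the convention that `rewards[0]` is the first reward received after time t (weighted by γ^0) are assumptions made for this example, not something the text above specifies.

```python
def discounted_return(rewards, gamma):
    """Return sum over k of gamma**k * rewards[k] for a finite reward sequence.

    Assumption: rewards[0] is the first reward after time t, so it gets weight 1.
    """
    g = 0.0
    # Work backwards so each step applies one more factor of gamma
    # to everything that comes later in the sequence.
    for r in reversed(rewards):
        g = r + gamma * g
    return g


# Three rewards of 1 with gamma = 0.5: 1 + 0.5 + 0.25
example = discounted_return([1.0, 1.0, 1.0], 0.5)
print(example)
```

With γ = 1 the function reduces to a plain sum, and with γ = 0 only the first reward counts, which matches the idea that smaller γ puts less weight on distant rewards.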
Value function approximation replaces a huge table of values with a small set of parameters that can generalize across similar states.
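One common way to do this is a linear approximator, where the value estimate is a weighted sum of state features. The sketch below is a minimal example under stated assumptions: the feature map `features`, the helper names `v_hat` and `update`, the step size, and the target value are all hypothetical choices for illustration.

```python
def features(state):
    # Hypothetical feature map: a bias term plus the state scaled by 1/10.
    return [1.0, state / 10.0]


def v_hat(state, weights):
    # Linear value estimate: dot product of the weights and the state features.
    return sum(w * x for w, x in zip(weights, features(state)))


def update(weights, state, target, step_size=0.1):
    # One gradient step that moves v_hat(state) toward the given target.
    # For a linear approximator the gradient w.r.t. the weights is the feature vector.
    error = target - v_hat(state, weights)
    return [w + step_size * error * x for w, x in zip(weights, features(state))]


weights = [0.0, 0.0]                           # two parameters instead of one table entry per state
weights = update(weights, state=5, target=1.0)  # learn from a single state

# The estimate for a nearby state that was never updated directly has also moved.
print(v_hat(5, weights), v_hat(6, weights))
```

Because states 5 and 6 share features, a single update on state 5 also changes the estimate for state 6. A lookup table would leave the entry for state 6 untouched.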