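The discounted return G_t sums all future rewards from time t onward, but each additional step of delay multiplies a reward's weight by a discount factor γ (with 0 ≤ γ ≤ 1), so distant rewards count for less than immediate ones.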
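As a minimal sketch of this definition, assuming a finite episode whose rewards are already collected in a list, the return can be accumulated backwards with the recursion G_t = R_{t+1} + γ G_{t+1}. The function name `discounted_return` and the sample rewards are illustrative assumptions, not taken from the source.

```python
def discounted_return(rewards, gamma):
    """Discounted return for a finite reward sequence.

    rewards[k] is assumed to be the reward received k + 1 steps after
    time t, so it is weighted by gamma ** k.
    """
    g = 0.0
    # Walk backwards so each step applies G_t = R_{t+1} + gamma * G_{t+1}.
    for r in reversed(rewards):
        g = r + gamma * g
    return g


# Hypothetical three-step episode: 1.0 + 0.5 * 0.0 + 0.25 * 2.0
print(discounted_return([1.0, 0.0, 2.0], gamma=0.5))  # 1.5
```

With γ = 1 the function reduces to a plain sum of rewards, and with γ = 0 only the first reward counts.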
Reinforcement Learning (RL) studies how an agent learns to act in an environment to maximize long-term cumulative reward.