Value function approximation replaces a huge table of values with a small set of parameters that can generalize across similar states.
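As a minimal sketch of this idea, the following linear approximator represents V(s) as a dot product of a weight vector with a feature vector, updated by a TD(0) rule. The class name, feature map, and learning rate here are illustrative assumptions, not from the original text.

```python
import numpy as np

def features(state):
    # Hypothetical feature map: raw state values plus a bias term.
    return np.append(state, 1.0)

class LinearValueFunction:
    """Approximate V(s) ~ w . phi(s): a small weight vector replaces
    one table entry per state, so similar states share value estimates."""

    def __init__(self, n_features, lr=0.1):
        self.w = np.zeros(n_features)
        self.lr = lr

    def value(self, state):
        return float(self.w @ features(state))

    def td_update(self, state, reward, next_state, gamma=0.99):
        # TD(0): nudge w toward the bootstrapped target r + gamma * V(s').
        target = reward + gamma * self.value(next_state)
        error = target - self.value(state)
        self.w += self.lr * error * features(state)
        return error
```

Because the update adjusts shared weights rather than a single cell, any state with overlapping features also shifts, which is exactly the generalization the sentence describes.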
Proximal Policy Optimization (PPO) stabilizes policy gradient learning by preventing each update from moving the policy too far from the previous one.
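The "not too far from the previous policy" constraint can be illustrated with PPO's clipped surrogate objective. The function below is a simplified, single-sample sketch (the clip threshold of 0.2 is a common default, assumed here), not a full training loop.

```python
import numpy as np

def ppo_clip_objective(ratio, advantage, eps=0.2):
    """Clipped surrogate objective (to be maximized).

    ratio = pi_new(a|s) / pi_old(a|s). Clipping the ratio to
    [1 - eps, 1 + eps] and taking the minimum removes any incentive
    to push the new policy far from the old one in a single update.
    """
    unclipped = ratio * advantage
    clipped = np.clip(ratio, 1.0 - eps, 1.0 + eps) * advantage
    return float(np.minimum(unclipped, clipped))
```

For a positive advantage, raising the ratio past 1 + eps yields no extra objective value, so gradient ascent has nothing to gain from a larger policy step, which is the stabilization mechanism the sentence refers to.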