Proximal Policy Optimization (PPO) stabilizes policy gradient learning by preventing each update from moving the policy too far from the previous one.
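A minimal sketch of PPO's clipped surrogate objective in NumPy; the function name, the toy batch, and the clip range ε = 0.2 are illustrative assumptions, not a reference implementation:

```python
import numpy as np

def ppo_clip_loss(logp_new, logp_old, advantages, eps=0.2):
    """PPO clipped surrogate loss (to be minimized).

    logp_new / logp_old: log-probabilities of the taken actions under
    the current and the pre-update policy; advantages: estimated
    advantages for those actions.
    """
    ratio = np.exp(logp_new - logp_old)                    # r_t(theta)
    unclipped = ratio * advantages
    clipped = np.clip(ratio, 1 - eps, 1 + eps) * advantages
    # The elementwise minimum caps how much an update can gain by
    # moving the policy far from the old one, which is what keeps
    # each step conservative.
    return -np.mean(np.minimum(unclipped, clipped))

# Toy usage on a small random batch (hypothetical data).
rng = np.random.default_rng(0)
logp_old = rng.normal(size=8)
logp_new = logp_old + rng.normal(scale=0.1, size=8)
adv = rng.normal(size=8)
print(ppo_clip_loss(logp_new, logp_old, adv))
```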
t-SNE and UMAP are nonlinear dimensionality-reduction methods that preserve local neighborhoods to make high-dimensional data visible in 2D or 3D.
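A short sketch of the usual workflow, assuming scikit-learn's `TSNE` and the third-party `umap-learn` package for UMAP; the digits dataset and parameter values here are illustrative choices:

```python
from sklearn.datasets import load_digits
from sklearn.manifold import TSNE

# 64-dimensional digit images projected to 2D for visualization;
# `labels` would typically color the resulting scatter plot.
X, labels = load_digits(return_X_y=True)
emb = TSNE(n_components=2, perplexity=30, random_state=0).fit_transform(X)
print(emb.shape)  # (1797, 2)

# UMAP (from the umap-learn package) follows the same fit/transform pattern:
#   import umap
#   emb = umap.UMAP(n_components=2, random_state=0).fit_transform(X)
```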