0-1 loss directly measures classification error but is discontinuous and non-convex, making optimization computationally hard.
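A minimal sketch of why this matters in practice: the 0-1 loss jumps at a margin of zero, while a convex surrogate such as the hinge loss upper-bounds it everywhere and can be optimized with gradient methods. The labels are assumed to be in {-1, +1}, with the margin defined as y times the model score; the function names are illustrative, not from any particular library.

```python
def zero_one_loss(margin: float) -> float:
    """1 if the prediction is wrong (margin <= 0), else 0.
    Discontinuous at margin = 0, and flat elsewhere (zero gradient)."""
    return 1.0 if margin <= 0 else 0.0

def hinge_loss(margin: float) -> float:
    """Convex surrogate: max(0, 1 - margin). Upper-bounds the 0-1 loss
    and is piecewise-linear, so it is amenable to (sub)gradient descent."""
    return max(0.0, 1.0 - margin)

# The surrogate upper-bounds the 0-1 loss at every margin value.
for m in [-2.0, -0.5, 0.0, 0.5, 2.0]:
    assert hinge_loss(m) >= zero_one_loss(m)
```

Minimizing the surrogate drives margins above 1, which in turn drives the 0-1 loss to zero, without ever needing a gradient of the discontinuous loss itself.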
Proximal Policy Optimization (PPO) stabilizes policy gradient learning by preventing each update from moving the policy too far from the previous one.
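A sketch of the mechanism behind that claim: PPO's clipped surrogate objective takes the minimum of the raw and clipped ratio terms, so once the probability ratio leaves the interval [1 - eps, 1 + eps] in the direction the advantage favors, the objective stops rewarding further movement. The ratios and advantage estimates are assumed given, and eps=0.2 is the commonly used default, not a value from this document.

```python
def ppo_clip_objective(ratios, advantages, eps=0.2):
    """Mean over a batch of min(r * A, clip(r, 1-eps, 1+eps) * A),
    where r is the new/old policy probability ratio and A the advantage."""
    total = 0.0
    for r, a in zip(ratios, advantages):
        clipped = max(1.0 - eps, min(r, 1.0 + eps))
        # Taking the min makes the objective a pessimistic bound:
        # large ratios cannot inflate it beyond the clipped value.
        total += min(r * a, clipped * a)
    return total / len(ratios)

# Ratio 2.0 with positive advantage is clipped to 1.2:
# pushing the policy further past the clip range gains nothing.
assert abs(ppo_clip_objective([2.0], [1.0]) - 1.2) < 1e-9
```

The same clipping applies symmetrically for negative advantages: ratios below 1 - eps are held at the clipped value, so the policy is discouraged from collapsing an action's probability in a single update.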