Groups
Category
Multi-task loss balancing aims to automatically set each task’s weight so that no single loss dominates training.
Autoregressive (AR) models represent a joint distribution by multiplying conditional probabilities in a fixed order, using the chain rule of probability.