RLHF (reinforcement learning from human feedback) turns human preferences between two model outputs into a training signal using a probabilistic model of choice.
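As a rough illustration of how a pairwise preference can become a training signal, here is a minimal sketch that assumes the choice model is Bradley-Terry, a common option that the text above does not name. Each output gets a scalar reward, the probability that one output is preferred is the logistic function of the reward difference, and the loss is the negative log-likelihood of the human's choice. The function names and the scalar rewards are hypothetical, chosen only for this example.

```python
import math


def preference_loss(reward_chosen: float, reward_rejected: float) -> float:
    """Negative log-likelihood of the human's choice.

    Assumes a Bradley-Terry choice model (an assumption of this sketch):
    loss = -log(sigmoid(reward_chosen - reward_rejected)).
    """
    margin = reward_chosen - reward_rejected
    # Numerically stable form of log(1 + exp(-margin)).
    if margin >= 0:
        return math.log1p(math.exp(-margin))
    return -margin + math.log1p(math.exp(margin))


def preference_probability(reward_a: float, reward_b: float) -> float:
    """Modelled probability that output A is preferred over output B."""
    return math.exp(-preference_loss(reward_a, reward_b))


if __name__ == "__main__":
    # Hypothetical reward scores for two outputs to the same prompt.
    print(preference_probability(1.2, 0.3))
    print(preference_loss(1.2, 0.3))
```

The loss shrinks as the reward of the chosen output rises above that of the rejected one, so minimizing it over many labelled pairs pushes a reward model toward agreeing with the human rankings.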
Transfer learning theory studies when and why a model trained on a source distribution will perform well on a different target distribution.