Learn Hard Problems During RL with Reference Guided Fine-tuning
Intermediate · Yangzhen Wu, Shanda Li et al. · Mar 1 · arXiv
ReGFT is a simple pre-RL step that shows the model partial human hints, then makes it solve problems in its own words, creating correct, model-style solutions for hard questions.
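
The data-construction idea in this summary can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the names `partial_hint`, `build_regft_examples`, `generate`, and `is_correct` are hypothetical, as are the prompt wording, the "keep the first half of the reference" rule, and the choice to store the bare question (without the hint) as the fine-tuning prompt.

```python
from typing import Callable, Dict, List


def partial_hint(reference: str, fraction: float = 0.5) -> str:
    """Return the leading `fraction` of a reference solution's lines (at least one line)."""
    lines = reference.strip().splitlines()
    keep = max(1, int(len(lines) * fraction))
    return "\n".join(lines[:keep])


def build_regft_examples(
    problems: List[Dict[str, str]],
    generate: Callable[[str], str],          # hypothetical: samples one solution from the model
    is_correct: Callable[[str, str], bool],  # hypothetical: checks a solution against the answer
    samples_per_problem: int = 4,
    fraction: float = 0.5,
) -> List[Dict[str, str]]:
    """Collect correct, model-written solutions produced with a partial hint in the prompt."""
    examples = []
    for problem in problems:
        prompt = (
            f"Problem:\n{problem['question']}\n\n"
            f"Partial hint:\n{partial_hint(problem['reference'], fraction)}\n\n"
            "Solve the problem in your own words."
        )
        for _ in range(samples_per_problem):
            solution = generate(prompt)
            if is_correct(solution, problem["answer"]):
                # Assumption: the training pair uses the bare question, not the hinted prompt.
                examples.append({"prompt": problem["question"], "completion": solution})
                break  # keep one correct solution per problem
    return examples
```

The resulting list of prompt/completion pairs would then feed an ordinary supervised fine-tuning run before RL starts. Problems for which no sampled solution passes the check are simply skipped in this sketch.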
#Reference-Guided Fine-Tuning #ReGFT #ReFT