Sci-CoE is a two-stage training method that teaches a single language model both to solve science problems and to check its own solutions, using very little labeled data.
DARC helps large language models improve by splitting training into two stable, well-organized stages instead of one chaotic loop.