Papers (2)

#asynchronous RL

ECHO-2: A Large-Scale Distributed Rollout Framework for Cost-Efficient Reinforcement Learning

ECHO-2 is a reinforcement learning training framework that keeps a small, central trainer busy while offloading the cheap work of generating rollouts to many low-cost machines spread around the world.

#ECHO-2 #distributed rollouts #bounded staleness

INTELLECT-3: Technical Report

Intermediate

Prime Intellect Team, Mika Senghaas et al. · Dec 18 · arXiv

INTELLECT-3 is a 106B-parameter Mixture-of-Experts model (about 12B parameters active per token) trained with large-scale reinforcement learning. It beats many larger models on math, coding, science, and reasoning benchmarks.

#INTELLECT-3 #prime-rl #verifiers