Robots used to explore by following simple rules or short-term rewards, which often made them waste time and backtrack a lot.
RelayLLM lets a small model do the talking and only asks a big model for help on a few, truly hard tokens.