This paper shows that generating short videos can help AI models plan and reason visually better than writing out the steps in text.
LEO-RobotAgent is a simple but powerful framework that lets a language model think, plan, and operate many kinds of robots using natural language.