Robots learn better when they think about how things move over time, not by redrawing every pixel of a video.
WoG (World Guidance) teaches a robot to imagine just the right bits of the near future and use those bits to pick better actions.
Vision-Language-Action (VLA) robots are powerful but too big and slow for many real-world devices.
This paper studies Vision-Language-Action (VLA) robots under one fair setup to find which design choices truly matter.
GigaBrain-0.5M* is a robot brain that sees, reads, and acts, and it gets smarter by imagining the future before moving.
RISE lets a robot learn safely and cheaply by practicing in its imagination instead of always in the real world.
Robots often spend the same amount of thinking on easy and hard moves, wasting time on simple steps while falling short on tricky ones.
Green-VLA is a step-by-step training recipe that teaches one model to see, understand language, and move many kinds of robots safely and efficiently.
DynamicVLA is a small and fast robot brain that sees, reads, and acts while things are moving.
IVRA is a simple, training-free add-on that helps robot brains keep the 2D shape of pictures while following language instructions.
Robots often learn a bad habit called the vision shortcut: they guess the task just by looking, and ignore the words you tell them.
TwinBrainVLA is a robot brain with two halves: a frozen generalist that keeps world knowledge safe and a trainable specialist that learns to move precisely.