Green-VLA is a step-by-step training recipe that teaches one model to see, understand language, and move many kinds of robots safely and efficiently.
DynamicVLA is a small and fast robot brain that sees, reads, and acts while things are moving.
IVRA is a simple, training-free add-on that helps robot brains preserve the 2D spatial layout of images while following language instructions.
Robots often learn a bad habit called the vision shortcut: they guess the task just by looking, and ignore the words you tell them.
TwinBrainVLA is a robot brain with two halves: a frozen generalist that keeps world knowledge safe and a trainable specialist that learns to move precisely.
Robots usually think in words and pictures, but their hands need exact motions, so there is a gap between understanding and doing.
Fast-ThinkAct teaches a robot to plan with a few tiny hidden "thought tokens" instead of long paragraphs, making it much faster while staying smart.
VLingNav is a robot navigation system that sees, reads instructions, and acts, while deciding when to think hard and when to just move.
Traditional self-driving used separate boxes for seeing, thinking, and acting, but tiny mistakes in early boxes could snowball into big problems later.
Robots often see the world as flat pictures but must move in a 3D world, which makes accurate actions hard.
DrivePI is a single, small (0.5B-parameter) multimodal language model that sees with cameras and LiDAR, talks in natural language, and plans driving actions all at once.
Vision-Language-Action (VLA) models are robots’ “see–think–do” brains that connect cameras (vision), words (language), and motors (action).