How I Study AI - Learn AI Papers & Lectures the Easy Way

Papers (2)

#Diffusion Policy

RoboVIP: Multi-View Video Generation with Visual Identity Prompting Augments Robot Manipulation

Intermediate
Boyang Wang, Haoran Zhang et al. · Jan 8 · arXiv

RoboVIP is a plug-and-play tool that turns ordinary robot videos into many new, realistic, multi-view training videos without changing the original robot actions.

#robotic manipulation · #video diffusion · #multi-view generation

An Anatomy of Vision-Language-Action Models: From Modules to Milestones and Challenges

Intermediate
Chao Xu, Suyu Zhang et al. · Dec 12 · arXiv

Vision-Language-Action (VLA) models are robots’ “see–think–do” brains that connect cameras (vision), words (language), and motors (action).

#Vision-Language-Action · #Embodied AI · #Multimodal Alignment