How I Study AI - Learn AI Papers & Lectures the Easy Way

Papers (5)

#LIBERO benchmark

VLANeXt: Recipes for Building Strong VLA Models

Intermediate
Xiao-Ming Wu, Bin Fan et al. · Feb 20 · arXiv

This paper studies Vision–Language–Action (VLA) models under a single, fair experimental setup to find which design choices truly matter.

#Vision-Language-Action #robot manipulation #flow matching

Recurrent-Depth VLA: Implicit Test-Time Compute Scaling of Vision-Language-Action Models via Latent Iterative Reasoning

Intermediate
Yalcin Tur, Jalal Naghiyev et al. · Feb 8 · arXiv

Robots often spend the same amount of computation on easy and hard moves, which wastes time on easy steps and falls short on tricky ones.

#Recurrent depth #Latent iterative reasoning #Vision-Language-Action

IVRA: Improving Visual-Token Relations for Robot Action Policy with Training-Free Hint-Based Guidance

Beginner
Jongwoo Park, Kanchana Ranasinghe et al. · Jan 22 · arXiv

IVRA is a simple, training-free add-on that helps robot brains keep the 2D shape of pictures while following language instructions.

#Vision-Language-Action #affinity map #training-free guidance

EVOLVE-VLA: Test-Time Training from Environment Feedback for Vision-Language-Action Models

Intermediate
Zechen Bai, Chen Gao et al. · Dec 16 · arXiv

Robots usually learn by copying many demonstrations, which is expensive and makes them brittle when things change.

#EVOLVE-VLA #test-time training #vision-language-action

Spatial-Aware VLA Pretraining through Visual-Physical Alignment from Human Videos

Intermediate
Yicheng Feng, Wanpeng Zhang et al. · Dec 15 · arXiv

Robots often see the world as flat pictures but must move in a 3D world, which makes accurate actions hard.

#Vision-Language-Action #3D spatial grounding #visual-physical alignment