vision-language-action-model topic
X-VLA
The official implementation of "Soft-Prompted Transformer as Scalable Cross-Embodiment Vision-Language-Action Model"
Motus
Official code for "Motus: A Unified Latent Action World Model"
Efficient-VLAs-Survey
🔥 A curated list of research accompanying the survey "A Survey on Efficient Vision-Language-Action Models". We will continue to maintain and update the repository, so follow us to keep up with the latest developments...
TongUI-agent
[AAAI 2026] Release of code, datasets, and models for our work "TongUI: Internet-Scale Trajectories from Multimodal Web Tutorials for Generalized GUI Agents"
INT-ACT
Official repo for "From Intention to Execution: Probing the Generalization Boundaries of Vision-Language-Action Models"