OpenVLA: How a 7B Open-Source Model Beat a 55B Closed-Source One
TL;DR
OpenVLA is an open-source Vision-Language-Action (VLA) model developed jointly by Stanford and UC Berkeley. It builds on the Prismatic VLM (a Llama 2 7B backbone with DINOv2 and SigLIP vision encoders) and was trained on 970k robot demonstrations curated from the Open X-Embodiment dataset. Zero-shot success ...
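To make the dual-encoder design concrete: per the Prismatic recipe, each image is encoded by both DINOv2 and SigLIP, the two per-patch feature streams are concatenated channel-wise, and a learned projector maps the result into the LLM's token embedding space. A minimal NumPy sketch of that data flow (all dimensions here are illustrative assumptions, not the real checkpoint's sizes, and the random linear layer stands in for the trained projector):

```python
import numpy as np

# Hypothetical dimensions for illustration (assumed, not from the checkpoint).
n_patches = 256
dino_dim, siglip_dim = 1024, 1152   # per-patch feature widths (assumed)
llm_dim = 4096                      # Llama 2 7B hidden size

dino_feats = np.random.randn(n_patches, dino_dim)
siglip_feats = np.random.randn(n_patches, siglip_dim)

# Channel-wise concatenation of the two vision streams:
# each patch now carries both semantic (SigLIP) and spatial (DINOv2) features.
fused = np.concatenate([dino_feats, siglip_feats], axis=-1)

# A single random linear layer stands in for the learned projector
# that maps fused patch features into the LLM's embedding space.
W = np.random.randn(dino_dim + siglip_dim, llm_dim) / np.sqrt(dino_dim + siglip_dim)
visual_tokens = fused @ W

print(visual_tokens.shape)  # (256, 4096): one LLM-space token per image patch
```

The resulting `visual_tokens` are prepended to the language tokens, so the LLM attends over image patches and instruction text jointly before decoding discretized action tokens.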
telos-robotics.hashnode.dev