X Square Robot Open-Sources Wall-OSS-0.5, Advances Pretrained VLA Performance
Event summary
- X Square Robot released Wall-OSS-0.5, a Vision-Language-Action (VLA) model designed for real-world robotic manipulation, on May 28, 2026.
- The pretrained model achieved task-progress scores above 80 on multiple tasks, including Block Sorting (100), Fruit Sorting (96), Ring Stacking (86), and Rope Tightening (82).
- Wall-OSS-0.5 is trained on a three-source mixture: self-collected manipulation data, curated open-source multi-embodiment trajectories, and a 90M-sample multimodal corpus.
- The model introduces three techniques: gradient-bridged co-training, a Vision-Aligned RVQ Action Tokenizer, and Action-Space Supervision for flow matching.
- X Square Robot open-sourced the full Wall-OSS-0.5 stack, including model weights, training code, training recipes, and optimizer implementations.
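For readers unfamiliar with the tokenizer named above, the following is a minimal sketch of plain residual vector quantization, assuming "RVQ" carries its usual meaning. It is not Wall-OSS-0.5's implementation: the summary does not describe the tokenizer's internals, the vision-alignment step is not modeled, and the codebooks, function names, and 2-D "action" below are hypothetical toy values chosen for illustration.

```python
from typing import List, Sequence

Vector = List[float]


def nearest_index(codebook: Sequence[Vector], target: Vector) -> int:
    """Return the index of the codeword closest to target (squared L2 distance)."""
    def sq_dist(code: Vector) -> float:
        return sum((c - t) ** 2 for c, t in zip(code, target))
    return min(range(len(codebook)), key=lambda i: sq_dist(codebook[i]))


def rvq_encode(action: Vector, codebooks: Sequence[Sequence[Vector]]) -> List[int]:
    """Quantize in stages: each stage encodes the residual left by the previous one."""
    residual = list(action)
    tokens: List[int] = []
    for codebook in codebooks:
        idx = nearest_index(codebook, residual)
        tokens.append(idx)
        # Subtract the chosen codeword so the next stage sees only the leftover error.
        residual = [r - c for r, c in zip(residual, codebook[idx])]
    return tokens


def rvq_decode(tokens: Sequence[int], codebooks: Sequence[Sequence[Vector]]) -> Vector:
    """Reconstruct the vector by summing the selected codeword from each stage."""
    out = [0.0] * len(codebooks[0][0])
    for idx, codebook in zip(tokens, codebooks):
        out = [o + c for o, c in zip(out, codebook[idx])]
    return out


# Hypothetical toy codebooks: a coarse stage followed by a finer correction stage.
COARSE = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
FINE = [[0.0, 0.0], [0.25, 0.0], [0.0, 0.25], [-0.25, 0.0], [0.0, -0.25]]

tokens = rvq_encode([1.2, 0.1], [COARSE, FINE])
reconstruction = rvq_decode(tokens, [COARSE, FINE])
```

In this toy setup the continuous vector becomes one discrete token per stage, which is the general property that lets a language-model-style backbone predict actions as tokens. In a real system the codebooks are learned rather than hand-written.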
The big picture
With Wall-OSS-0.5, X Square Robot adds an openly available pretrained Vision-Language-Action model for robotic manipulation. By open-sourcing the model and its training stack, the company aims to support further research and development toward general-purpose embodied AI. The release fits a broader industry push toward more capable and adaptable robotic systems, and it could accelerate progress in automation and AI-driven physical tasks.
What we're watching
- Model Adoption
  - How quickly the open-source Wall-OSS-0.5 model will be adopted by researchers and developers in the embodied AI community.
- Performance Scaling
  - Whether the model's performance can be further improved through additional pretraining or fine-tuning on specific tasks.
- Industry Impact
  - The pace at which advancements in pretrained VLA models like Wall-OSS-0.5 will drive innovation in real-world robotic applications.