X Square Robot Open-Sources Wall-OSS-0.5, Advances Pretrained VLA Performance — X Square Robot Technology (Shenzhen) Co., Ltd.

Event summary

X Square Robot released Wall-OSS-0.5, a Vision-Language-Action (VLA) model designed for real-world robotic manipulation, on May 28, 2026.
The pretrained model achieved task-progress scores above 80 on multiple tasks, including Block Sorting (100), Fruit Sorting (96), Ring Stacking (86), and Rope Tightening (82).
Wall-OSS-0.5 is trained on a three-source mixture: self-collected manipulation data, curated open-source multi-embodiment trajectories, and a 90M-sample multimodal corpus.
The model introduces three techniques: gradient-bridged co-training, a Vision-Aligned RVQ Action Tokenizer, and Action-Space Supervision for flow matching.
X Square Robot open-sourced the full Wall-OSS-0.5 stack, including model weights, training code, training recipes, and optimizer implementations.
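
The summary names an RVQ action tokenizer without describing it. As background only, here is a minimal sketch of generic residual vector quantization, assuming "RVQ" carries that usual meaning. The codebooks, the 2-D "action" vector, and the function names are made up for illustration. This is not the Wall-OSS-0.5 tokenizer and it says nothing about the vision-alignment part.

```python
from typing import List, Sequence

Vector = Sequence[float]


def nearest(codebook: Sequence[Vector], x: Vector) -> int:
    """Return the index of the codebook entry closest to x (squared Euclidean distance)."""
    return min(
        range(len(codebook)),
        key=lambda i: sum((a - b) ** 2 for a, b in zip(codebook[i], x)),
    )


def rvq_encode(x: Vector, codebooks: Sequence[Sequence[Vector]]) -> List[int]:
    """Quantize x in stages; each stage encodes the residual left by the previous one."""
    residual = list(x)
    tokens = []
    for codebook in codebooks:
        idx = nearest(codebook, residual)
        tokens.append(idx)
        residual = [r - c for r, c in zip(residual, codebook[idx])]
    return tokens


def rvq_decode(tokens: Sequence[int], codebooks: Sequence[Sequence[Vector]]) -> List[float]:
    """Reconstruct the vector by summing the selected entry from each stage."""
    out = [0.0] * len(codebooks[0][0])
    for idx, codebook in zip(tokens, codebooks):
        out = [o + c for o, c in zip(out, codebook[idx])]
    return out


# Hypothetical toy codebooks: a coarse stage followed by a fine correction stage.
COARSE = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
FINE = [(0.0, 0.0), (0.25, 0.0), (0.0, 0.25), (-0.25, 0.0), (0.0, -0.25)]

action = (1.2, 0.9)  # made-up continuous action vector
tokens = rvq_encode(action, [COARSE, FINE])
recon = rvq_decode(tokens, [COARSE, FINE])
print(tokens, recon)
```

In this toy case the continuous vector becomes two discrete tokens, one per stage, and the second stage narrows the error left by the first. That staged refinement is the general appeal of RVQ for turning continuous actions into tokens a language-model backbone can predict.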

The big picture

X Square Robot's release of Wall-OSS-0.5 advances pretrained Vision-Language-Action models for robotic manipulation. By open-sourcing the model and its training stack, the company aims to support further research and development toward general-purpose embodied AI. The move fits a broader industry push toward more capable and adaptable robotic systems, and it could accelerate progress in automation and AI-driven physical tasks.

What we're watching

Model Adoption: How quickly the open-source Wall-OSS-0.5 model will be adopted by researchers and developers in the embodied AI community.
Performance Scaling: Whether the model's performance can be further improved through additional pretraining or fine-tuning on specific tasks.
Industry Impact: How quickly advances in pretrained VLA models such as Wall-OSS-0.5 will translate into real-world robotic applications.