VLA-Zoo · Reports

π0.5 reports

Self-contained reports on the π0.5 vision-language-action model and its ablations on LIBERO. Internal — these pages embed infra paths, SLURM job IDs, and unpublished eval numbers; don't share the URL publicly.

π0.5 — architecture & tensor IO

model card

A dual-expert flow-matching VLA, pinned down: every module's input/output tensor shape for pi05_libero, the camera-input contract, and each GitHub-issue ablation knob mapped to the exact module it touches.

π0.5 wrist-only — full-FT baselines

report · comparison

The apparent difficulty of wrist-only π0.5 was largely an artifact of the masking mechanism, not a capacity ceiling: physically removing the third-person camera from attention nearly recovers the both-camera ceiling (94.2% vs. 96.6%), while zero-masking it reaches only 27.4%.

physical removal 94.2% · zero-mask 27.4% · both-cam ceiling 96.6% · LoRA floor 5.0%
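
To make the masking distinction concrete, here is a toy single-query attention sketch in NumPy. It is not the π0.5 code path. The token counts, the key dimension, and the choice to model a zero-masked camera as all-zero keys are assumptions for illustration (in a real model a blanked image would still pass through the vision encoder and yield nonzero tokens). What it shows is only the general mechanism: tokens that stay in the sequence keep absorbing softmax mass, while keys removed from attention get exactly zero weight.

```python
import numpy as np

def attention_weights(q, k, key_mask=None):
    """Softmax attention weights of one query over a set of keys.

    key_mask (hypothetical helper argument): boolean array, False means the
    key is removed from attention entirely.
    """
    scores = k @ q / np.sqrt(q.shape[-1])
    if key_mask is not None:
        scores = np.where(key_mask, scores, -np.inf)
    # Subtract the largest finite score for numerical stability.
    scores = scores - scores[np.isfinite(scores)].max()
    w = np.exp(scores)  # exp(-inf) == 0.0, so removed keys get zero weight
    return w / w.sum()

# Illustrative sizes only; not taken from pi05_libero.
rng = np.random.default_rng(0)
d, n_wrist, n_third = 8, 4, 4
q = rng.normal(size=d)
k_wrist = rng.normal(size=(n_wrist, d))
k_third = rng.normal(size=(n_third, d))

# "Zero-mask" (toy version): third-person keys are zeroed but stay in the sequence.
k_zero = np.concatenate([k_wrist, np.zeros((n_third, d))])
w_zero = attention_weights(q, k_zero)

# "Physical removal" (toy version): third-person keys are excluded from attention.
k_all = np.concatenate([k_wrist, k_third])
mask = np.array([True] * n_wrist + [False] * n_third)
w_removed = attention_weights(q, k_all, mask)

third_mass_zero = w_zero[n_wrist:].sum()        # strictly positive
third_mass_removed = w_removed[n_wrist:].sum()  # exactly 0.0
```

In this toy setup the zeroed keys each score 0 before the softmax, so they still dilute the weight available to the wrist tokens; with the mask, all attention mass goes to the wrist tokens.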