Fix torchvision installation in TPU CI test setup script
Currently, TPU CI installs the torch nightly whl before running tests, but torch_xla in TPU CI is built against PyTorch HEAD. This can cause compatibility issues (e.g. symbol mismatches in the .so).
- torchaudio is not needed for CI, so remove the torchaudio installation from the setup script as well.
- Instead of installing nightly torchvision, use the torchvision version pinned in the PyTorch src folder.
I cannot find a way to read `pytorch/.github/ci_commit_pins` in `test/tpu/xla_test_job.yaml`. When running `pip install "git+https://github.com/pytorch/vision.git@$TORCHVISION_COMMIT"`, `$TORCHVISION_COMMIT` is an empty string.
To rule out the possibility that env vars cannot be changed in the `.yaml` config files, I put the `pip` command in a shell script, but it looks like the path `src/pytorch/.github/ci_commit_pins` is not valid.
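For illustration, here is a minimal sketch of how the pinned commit could be read and turned into a pip requirement. It assumes the pin is stored in a file named `vision.txt` under `.github/ci_commit_pins` of the PyTorch checkout; `PYTORCH_DIR`, `read_commit_pin`, and `torchvision_requirement` are hypothetical names, not part of the existing setup script. Failing loudly on a missing file or an empty commit would surface the empty `$TORCHVISION_COMMIT` problem instead of passing a bare `@` to pip.

```shell
# Sketch only: the helper names, PYTORCH_DIR, and the vision.txt
# file name are assumptions, not taken from the actual CI scripts.

# Print the commit hash stored in a pin file, with whitespace stripped.
# Fails if the file does not exist, so a bad path is caught early.
read_commit_pin() {
  local pin_file="$1"
  if [ ! -f "$pin_file" ]; then
    echo "pin file not found: $pin_file" >&2
    return 1
  fi
  tr -d '[:space:]' < "$pin_file"
}

# Build the pip requirement string for a given torchvision commit.
# Fails on an empty commit rather than emitting "vision.git@".
torchvision_requirement() {
  local commit="$1"
  if [ -z "$commit" ]; then
    echo "empty torchvision commit" >&2
    return 1
  fi
  printf 'git+https://github.com/pytorch/vision.git@%s' "$commit"
}

# Possible usage in a setup script (not executed here):
#   TORCHVISION_COMMIT="$(read_commit_pin "$PYTORCH_DIR/.github/ci_commit_pins/vision.txt")" || exit 1
#   pip install "$(torchvision_requirement "$TORCHVISION_COMMIT")"
```

Resolving the commit inside the script, rather than through an env var set in the `.yaml` config, keeps the lookup and the install in the same shell process.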
Can you also mirror the fix to the GHA CI if it works? https://github.com/pytorch/xla/blob/master/.github/workflows/tpu_ci.yml
The commit pin will be more readily available.
I made a few attempts in the previous commits to fix the GHA TPU CI but did not succeed. Let's land the fix for installing torchvision properly in the old TPU CI.
https://github.com/pytorch/xla/blob/master/.github/workflows/tpu_ci.yml
GHA TPU CI fixed in https://github.com/pytorch/xla/pull/6730