[TPU] Install torch_xla in CUDA CI

Open carmocca opened this issue 2 years ago • 3 comments

What does this PR do?

Part of #16130

Sets up XLA testing on CUDA CI, along with the related changes needed to make it work:

  • Add connector support for non-TPU XLA devices
  • Update existing RunIf(tpu=...) tests and convert some to RunIf(xla=...)
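A minimal sketch of the distinction between the two conditions, assuming `xla=True` only requires `torch_xla` to be installed (so the test can also run on CUDA CI), while `tpu=True` additionally requires a TPU. This is not the actual `RunIf` implementation; `skip_reasons`, `xla_installed`, `has_xla`, and `has_tpu` are hypothetical names used for illustration only.

```python
import importlib.util
from typing import List, Optional


def xla_installed() -> bool:
    """Return True when the torch_xla package is importable here."""
    return importlib.util.find_spec("torch_xla") is not None


def skip_reasons(
    xla: bool = False,
    tpu: bool = False,
    *,
    has_xla: Optional[bool] = None,
    has_tpu: bool = False,
) -> List[str]:
    """Collect the unmet requirements for a test (hypothetical helper).

    Assumption: ``xla=True`` only needs torch_xla to be installed, so the
    test can run on CUDA or CPU, while ``tpu=True`` also needs a TPU.
    ``has_xla`` and ``has_tpu`` are injectable so the logic can be checked
    without the hardware; ``has_xla=None`` means "detect it".
    """
    if has_xla is None:
        has_xla = xla_installed()
    reasons = []
    if xla and not has_xla:
        reasons.append("XLA")
    if tpu and not (has_xla and has_tpu):
        reasons.append("TPU")
    return reasons


# A CUDA CI machine with torch_xla installed but no TPU attached:
print(skip_reasons(xla=True, has_xla=True))  # nothing unmet, the test runs
print(skip_reasons(tpu=True, has_xla=True))  # TPU unmet, the test is skipped
```

Under this assumption, a test converted from `RunIf(tpu=...)` to `RunIf(xla=...)` stops being skipped on machines that have `torch_xla` but no TPU.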

XLA support for CUDA requires XLA nightly to work due to https://github.com/pytorch/xla/issues/4988

cc @carmocca @borda @JackCaoG @steventk-g @Liyang90 @justusschock @awaelchli

carmocca · Apr 28 '23 15:04

⛈️ Required checks status: Has failure 🔴

Warning: This job will need to be re-run to merge your PR. If you do not have write access to the repository, you can ask Lightning-AI/lai-frameworks to re-run it. If you push a new commit, all of CI will re-trigger.

Groups summary

🔴 pytorch_lightning: Tests workflow
Check ID Status
pl-cpu (macOS-11, lightning, 3.8, 1.11) success
pl-cpu (macOS-11, lightning, 3.9, 1.12) success
pl-cpu (macOS-11, lightning, 3.10, 1.13) success
pl-cpu (macOS-11, lightning, 3.10, 2.0) success
pl-cpu (macOS-11, lightning, 3.8, 1.11, oldest) success
pl-cpu (ubuntu-20.04, lightning, 3.8, 1.11) success
pl-cpu (ubuntu-20.04, lightning, 3.9, 1.12) success
pl-cpu (ubuntu-20.04, lightning, 3.10, 1.13) success
pl-cpu (ubuntu-20.04, lightning, 3.10, 2.0) failure
pl-cpu (ubuntu-20.04, lightning, 3.8, 1.11, oldest) success
pl-cpu (windows-2022, lightning, 3.8, 1.11) success
pl-cpu (windows-2022, lightning, 3.9, 1.12) success
pl-cpu (windows-2022, lightning, 3.10, 1.13) success
pl-cpu (windows-2022, lightning, 3.10, 2.0) success
pl-cpu (windows-2022, lightning, 3.8, 1.11, oldest) success
pl-cpu (macOS-11, pytorch, 3.8, 1.13) success
pl-cpu (ubuntu-20.04, pytorch, 3.8, 1.13) success
pl-cpu (windows-2022, pytorch, 3.8, 1.13) success

These checks are required after the changes to src/lightning/fabric/accelerators/cuda.py, src/lightning/fabric/accelerators/xla.py, src/lightning/fabric/connector.py, src/lightning/fabric/utilities/testing/_runif.py, requirements/pytorch/strategies.txt, src/lightning/pytorch/trainer/connectors/accelerator_connector.py, src/lightning/pytorch/utilities/testing/_runif.py, tests/tests_pytorch/conftest.py.

🔴 pytorch_lightning: Azure GPU
Check ID Status
pytorch-lightning (GPUs) failure

These checks are required after the changes to .azure/gpu-tests-pytorch.yml, requirements/pytorch/strategies.txt, src/lightning/pytorch/trainer/connectors/accelerator_connector.py, src/lightning/pytorch/utilities/testing/_runif.py, tests/tests_pytorch/conftest.py, src/lightning/fabric/accelerators/cuda.py, src/lightning/fabric/accelerators/xla.py, src/lightning/fabric/connector.py, src/lightning/fabric/utilities/testing/_runif.py.

🟢 pytorch_lightning: Benchmarks
Check ID Status
lightning.Benchmarks success

These checks are required after the changes to requirements/pytorch/strategies.txt, src/lightning/fabric/accelerators/cuda.py, src/lightning/fabric/accelerators/xla.py, src/lightning/fabric/connector.py, src/lightning/fabric/utilities/testing/_runif.py, src/lightning/pytorch/trainer/connectors/accelerator_connector.py, src/lightning/pytorch/utilities/testing/_runif.py.

🟢 fabric: Docs
Check ID Status
make-doctest (fabric) success
make-html (fabric) success

These checks are required after the changes to src/lightning/fabric/accelerators/cuda.py, src/lightning/fabric/accelerators/xla.py, src/lightning/fabric/connector.py, src/lightning/fabric/utilities/testing/_runif.py.

🟢 pytorch_lightning: Docs
Check ID Status
make-doctest (pytorch) success
make-html (pytorch) success

These checks are required after the changes to src/lightning/pytorch/trainer/connectors/accelerator_connector.py, src/lightning/pytorch/utilities/testing/_runif.py, requirements/pytorch/strategies.txt.

🟢 pytorch_lightning: Docker
Check ID Status
build-cuda (3.9, 1.11, 11.3.1) success
build-cuda (3.9, 1.12, 11.7.1) success
build-cuda (3.9, 1.13, 12.0.1) success
build-cuda (3.10, 2.0, 12.0.1) success
build-cuda (3.10, 2.0, 11.7.1) success
build-cuda (3.8, 2.0, 11.7.1) success
build-pl (3.9, 1.11, 11.3.1) success
build-pl (3.9, 1.12, 11.7.1) success
build-pl (3.9, 1.13, 12.0.1) success
build-pl (3.10, 2.0, 12.0.1) success

These checks are required after the changes to .github/workflows/ci-dockers.yml, requirements/pytorch/strategies.txt.

🟢 lightning_fabric: CPU workflow
Check ID Status
fabric-cpu (macOS-11, lightning, 3.8, 1.11) success
fabric-cpu (macOS-11, lightning, 3.9, 1.12) success
fabric-cpu (macOS-11, lightning, 3.10, 1.13) success
fabric-cpu (macOS-11, lightning, 3.10, 2.0) success
fabric-cpu (macOS-11, lightning, 3.8, 1.11, oldest) success
fabric-cpu (ubuntu-20.04, lightning, 3.8, 1.11) success
fabric-cpu (ubuntu-20.04, lightning, 3.9, 1.12) success
fabric-cpu (ubuntu-20.04, lightning, 3.10, 1.13) success
fabric-cpu (ubuntu-20.04, lightning, 3.10, 2.0) success
fabric-cpu (ubuntu-20.04, lightning, 3.8, 1.11, oldest) success
fabric-cpu (windows-2022, lightning, 3.8, 1.11) success
fabric-cpu (windows-2022, lightning, 3.9, 1.12) success
fabric-cpu (windows-2022, lightning, 3.10, 1.13) success
fabric-cpu (windows-2022, lightning, 3.10, 2.0) success
fabric-cpu (windows-2022, lightning, 3.8, 1.11, oldest) success
fabric-cpu (macOS-11, fabric, 3.8, 1.13) success
fabric-cpu (ubuntu-20.04, fabric, 3.8, 1.13) success
fabric-cpu (windows-2022, fabric, 3.8, 1.13) success

These checks are required after the changes to src/lightning/fabric/accelerators/cuda.py, src/lightning/fabric/accelerators/xla.py, src/lightning/fabric/connector.py, src/lightning/fabric/utilities/testing/_runif.py, tests/tests_fabric/conftest.py, tests/tests_fabric/strategies/test_dp.py, tests/tests_fabric/strategies/test_single_device.py.

🟢 lightning_fabric: Azure GPU
Check ID Status
lightning-fabric (GPUs) success

These checks are required after the changes to .azure/gpu-tests-fabric.yml, src/lightning/fabric/accelerators/cuda.py, src/lightning/fabric/accelerators/xla.py, src/lightning/fabric/connector.py, src/lightning/fabric/utilities/testing/_runif.py, tests/tests_fabric/conftest.py, tests/tests_fabric/strategies/test_dp.py, tests/tests_fabric/strategies/test_single_device.py.

🟢 mypy
Check ID Status
mypy success

These checks are required after the changes to requirements/pytorch/strategies.txt, src/lightning/fabric/accelerators/cuda.py, src/lightning/fabric/accelerators/xla.py, src/lightning/fabric/connector.py, src/lightning/fabric/utilities/testing/_runif.py, src/lightning/pytorch/trainer/connectors/accelerator_connector.py, src/lightning/pytorch/utilities/testing/_runif.py.

🟢 install
Check ID Status
install-pkg (ubuntu-22.04, app, 3.8) success
install-pkg (ubuntu-22.04, app, 3.10) success
install-pkg (ubuntu-22.04, fabric, 3.8) success
install-pkg (ubuntu-22.04, fabric, 3.10) success
install-pkg (ubuntu-22.04, pytorch, 3.8) success
install-pkg (ubuntu-22.04, pytorch, 3.10) success
install-pkg (ubuntu-22.04, lightning, 3.8) success
install-pkg (ubuntu-22.04, lightning, 3.10) success
install-pkg (ubuntu-22.04, notset, 3.8) success
install-pkg (ubuntu-22.04, notset, 3.10) success
install-pkg (macOS-12, app, 3.8) success
install-pkg (macOS-12, app, 3.10) success
install-pkg (macOS-12, fabric, 3.8) success
install-pkg (macOS-12, fabric, 3.10) success
install-pkg (macOS-12, pytorch, 3.8) success
install-pkg (macOS-12, pytorch, 3.10) success
install-pkg (macOS-12, lightning, 3.8) success
install-pkg (macOS-12, lightning, 3.10) success
install-pkg (macOS-12, notset, 3.8) success
install-pkg (macOS-12, notset, 3.10) success
install-pkg (windows-2022, app, 3.8) success
install-pkg (windows-2022, app, 3.10) success
install-pkg (windows-2022, fabric, 3.8) success
install-pkg (windows-2022, fabric, 3.10) success
install-pkg (windows-2022, pytorch, 3.8) success
install-pkg (windows-2022, pytorch, 3.10) success
install-pkg (windows-2022, lightning, 3.8) success
install-pkg (windows-2022, lightning, 3.10) success
install-pkg (windows-2022, notset, 3.8) success
install-pkg (windows-2022, notset, 3.10) success

These checks are required after the changes to src/lightning/fabric/accelerators/cuda.py, src/lightning/fabric/accelerators/xla.py, src/lightning/fabric/connector.py, src/lightning/fabric/utilities/testing/_runif.py, src/lightning/pytorch/trainer/connectors/accelerator_connector.py, src/lightning/pytorch/utilities/testing/_runif.py, requirements/pytorch/strategies.txt.


Thank you for your contribution! 💜

Note: This comment is automatically generated and updates for 60 minutes every 180 seconds. If you have any other questions, contact carmocca for help.

github-actions[bot] · May 05 '23 02:05

This is blocked by https://github.com/pytorch/xla/issues/4988

carmocca · May 08 '23 15:05

⚠️ GitGuardian has uncovered 2 secrets following the scan of your pull request.

Please consider investigating the findings and remediating the incidents. Failure to do so may lead to compromising the associated services or software components.

🔎 Detected hardcoded secrets in your pull request
GitGuardian id Secret Commit Filename
- Generic High Entropy Secret 78fa3afdfbf964c19b4b2d36b91560698aa83178 tests/tests_app/utilities/test_login.py
- Base64 Basic Authentication 78fa3afdfbf964c19b4b2d36b91560698aa83178 tests/tests_app/utilities/test_login.py
🛠 Guidelines to remediate hardcoded secrets
  1. Understand the implications of revoking this secret by investigating where it is used in your code.
  2. Replace your secret and store it safely, following best practices for secret storage.
  3. Revoke and rotate this secret.
  4. If possible, rewrite git history. Rewriting git history is not a trivial act. You might completely break other contributing developers' workflow and you risk accidentally deleting legitimate data.
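One common way to act on step 2 is to read credentials from the environment at runtime rather than committing them. The sketch below is illustrative only and is not taken from `tests/tests_app/utilities/test_login.py`; the helper names and the `LOGIN_TEST_USER` / `LOGIN_TEST_PASSWORD` variable names are assumptions. It builds a Basic authentication header from values supplied at runtime, so neither the credentials nor their Base64 encoding appear in the source.

```python
import base64
import os
from typing import Mapping


def load_secret(name: str, env: Mapping[str, str] = os.environ) -> str:
    """Fetch a secret from the environment and fail loudly if it is missing."""
    value = env.get(name)
    if not value:
        raise RuntimeError(f"environment variable {name} is not set")
    return value


def basic_auth_header(username: str, password: str) -> str:
    """Build an HTTP Basic Authorization header value at runtime."""
    raw = f"{username}:{password}".encode("utf-8")
    return "Basic " + base64.b64encode(raw).decode("ascii")


# Stand-in for os.environ so the sketch runs anywhere; the variable names
# are hypothetical and the values are obviously fake.
fake_env = {"LOGIN_TEST_USER": "user", "LOGIN_TEST_PASSWORD": "pass"}
header = basic_auth_header(
    load_secret("LOGIN_TEST_USER", fake_env),
    load_secret("LOGIN_TEST_PASSWORD", fake_env),
)
print(header)
```

In CI, the same variables would be injected from the provider's secret store. Moving a value out of the source does not make an already committed secret safe, so the revoke and rotate step still applies.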

To avoid such incidents in the future consider


🦉 GitGuardian detects secrets in your source code to help developers and security teams secure the modern development process. You are seeing this because you or someone else with access to this repository has authorized GitGuardian to scan your pull request.


gitguardian[bot] · Jan 16 '24 09:01