PyTorch CUDA in registry nightly images
Are you testing whether the nightly image is usable with CUDA?
torch.cuda._is_compiled() returns False inside the most recent nightly image.
We should add the Docker images to the validation framework.
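For reference, the check being discussed could be sketched roughly as below. This is only an illustration, not the validation framework's actual code: the function names cuda_build_report and validate_cuda_build are hypothetical, and torch.cuda._is_compiled() is a private helper that may change between releases.

```python
def cuda_build_report(torch_module):
    """Collect the CUDA-related facts worth asserting on for a CUDA image."""
    return {
        "torch_version": torch_module.__version__,
        # None on CPU-only builds, a version string otherwise.
        "built_cuda_version": torch_module.version.cuda,
        # Private helper quoted in this issue; not a stable public API.
        "cuda_compiled": bool(torch_module.cuda._is_compiled()),
        # Also needs a visible GPU and driver, so it can be False on CPU runners.
        "cuda_available": bool(torch_module.cuda.is_available()),
    }


def validate_cuda_build(torch_module):
    """Return failure messages; an empty list means the build looks CUDA-enabled."""
    report = cuda_build_report(torch_module)
    failures = []
    if not report["cuda_compiled"]:
        failures.append("torch.cuda._is_compiled() returned False")
    if report["built_cuda_version"] is None:
        failures.append("torch.version.cuda is None (CPU-only build?)")
    return failures
```

Inside the image this would be called as validate_cuda_build(torch) after import torch. The runtime check torch.cuda.is_available() is reported but deliberately not treated as a failure, since it also depends on the runner having a GPU.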
Workflow: https://github.com/pytorch/pytorch/blob/main/.github/workflows/docker-release.yml
Docker containers are located here: https://github.com/orgs/pytorch/packages/container/package/pytorch
Simple install command:
docker pull ghcr.io/pytorch/pytorch:2.2.1-cuda11.8-cudnn8-devel
Build workflow: https://github.com/pytorch/pytorch/actions/runs/8200189724/job/22426518545
We should add automation around validation of Docker images for both nightly and releases.
Release workflow: https://github.com/pytorch/pytorch/actions/runs/8393526521/job/22988732918
Onboard to validation framework: https://github.com/pytorch/builder/actions/workflows/validate-binaries.yml
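As a rough sketch of what an automated check might run for each image, the snippet below builds a docker run command that executes the CUDA check inside the container. It assumes python is on the PATH inside the image and that the host has the NVIDIA container toolkit when --gpus is requested. The helper name docker_check_command is hypothetical, and the tag is the one from the install command above.

```python
import shlex

# One-liner executed inside the container; exits non-zero when the build lacks CUDA.
CHECK_SNIPPET = (
    "import sys, torch; "
    "print(torch.__version__, torch.version.cuda); "
    "sys.exit(0 if torch.cuda._is_compiled() else 1)"
)


def docker_check_command(image, use_gpu=False):
    """Build the `docker run` argument list for checking one image."""
    cmd = ["docker", "run", "--rm"]
    if use_gpu:
        # Assumption: the NVIDIA container toolkit is installed on the host.
        cmd += ["--gpus", "all"]
    # Assumption: `python` resolves to the interpreter that has torch installed.
    cmd += [image, "python", "-c", CHECK_SNIPPET]
    return cmd


image = "ghcr.io/pytorch/pytorch:2.2.1-cuda11.8-cudnn8-devel"
# Print a copy-pasteable shell command; pass the list to subprocess.run to execute it.
print(shlex.join(docker_check_command(image)))
```

The compile-time check does not need a GPU, so the same command could run on a CPU runner first and on a GPU runner with use_gpu=True afterwards.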
cc @juliagmt-google
Thanks for sharing the task and details. Here are my questions:
- torch.cuda._is_compiled() is false inside the last nightly image: where can I see the output?
- What exactly is the validation?
- I saw docker pull ghcr.io/pytorch/pytorch:2.2.2-cuda11.8-cudnn8-devel in https://github.com/orgs/pytorch/packages/container/package/pytorch, where the Docker containers are located, but the instructions say to install with docker pull ghcr.io/pytorch/pytorch:2.2.1-cuda11.8-cudnn8-devel, which has a different PyTorch version. Which command should I use?
- Which files do we need to change to add automation and validation?
Link to the images to validate: https://github.com/orgs/pytorch/packages/container/package/pytorch-nightly
Nova workflows for reference: https://github.com/pytorch/test-infra/wiki/Using-Nova-Reusable-Build-Workflows
Try to call: https://github.com/pytorch/test-infra/blob/main/.github/workflows/linux_job.yml
For GPU runners we need to use pytorch/test-infra/.github/workflows/linux_job.yml@main
It would be nice for the CI job to also check layer invalidation: tracking the nightly image day by day will quite soon fill the artifact registry space and the local build cache. See https://github.com/pytorch/pytorch/issues/125862
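One possible way to measure this, sketched under the assumption that both nightly images are available locally and that the JSON comes from docker image inspect IMAGE, is to count how many leading layers two consecutive nightlies still share. The helper names parse_layers and shared_prefix_length are hypothetical.

```python
import json


def parse_layers(inspect_json):
    """Extract the ordered layer digests from `docker image inspect IMAGE` output."""
    # `docker image inspect` prints a JSON array with one object per image.
    return json.loads(inspect_json)[0]["RootFS"]["Layers"]


def shared_prefix_length(old_layers, new_layers):
    """Count leading layers that are identical in both images.

    Layers after the first mismatch have to be stored and pulled again, so a
    small number here means most of the image is invalidated every day.
    """
    count = 0
    for old, new in zip(old_layers, new_layers):
        if old != new:
            break
        count += 1
    return count
```

A CI job could then fail, or just warn, when the shared prefix between yesterday's and today's nightly drops below some threshold chosen by the maintainers.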