VS installer exited with code -1 flakily when building Windows binaries
I'm currently seeing quite a number of flaky failures when building Windows binaries in trunk, for example https://github.com/pytorch/pytorch/actions/runs/4744014597/jobs/8424319388
The error points to this step, https://github.com/pytorch/builder/blob/main/windows/internal/vs2022_install.ps1#L42, in which vs_installer.exe is installed. The exact error is "VS installer exited with code -1, which should be one of [0, 3010]". I have already tried disabling Windows Defender there (https://github.com/pytorch/pytorch/pull/99389), but it doesn't seem to help.
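For reference, the kind of check that produces this message can be sketched roughly as below. This is only a minimal sketch, not the actual contents of vs2022_install.ps1; the function name Test-VsInstallerExitCode is hypothetical. Exit code 3010 is the standard Windows installer code for "success, reboot required", which is why it is accepted alongside 0.

```powershell
# Hypothetical helper: returns $true when the installer exit code counts as success.
# 0    = success
# 3010 = success, but a reboot is required
function Test-VsInstallerExitCode {
    param([int]$ExitCode)
    return $ExitCode -in @(0, 3010)
}

# Example with the code seen in the failing jobs.
$exitCode = -1
if (-not (Test-VsInstallerExitCode -ExitCode $exitCode)) {
    Write-Output "VS installer exited with code $exitCode, which should be one of [0, 3010]"
}
```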
Another minor bug is when vslogs.zip is copied at https://github.com/pytorch/builder/blob/main/windows/internal/vs2022_install.ps1#L54. The correct path should be C:\Users\${env:USERNAME}\AppData\Local\Temp\vslogs.zip, as the user is now runneruser instead of circleci. This hides the above error.
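A possible shape for the fix, as a minimal sketch: build the source path from the current user rather than hard-coding circleci, and guard the copy so a missing archive does not mask the installer failure. The destination path and the warning text are assumptions for illustration, not what the script currently does.

```powershell
# Source path derived from the current user (runneruser on the new runners).
$vsLogs = "C:\Users\${env:USERNAME}\AppData\Local\Temp\vslogs.zip"

# Hypothetical destination; the real script may copy the archive elsewhere.
$destination = ".\vslogs.zip"

if (Test-Path -Path $vsLogs) {
    Copy-Item -Path $vsLogs -Destination $destination
} else {
    # Warn instead of throwing so the original installer error stays visible.
    Write-Warning "VS logs not found at $vsLogs; skipping copy."
}
```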
cc @atalman @malfet @Blackhex
VS2022 should be part of the AMI, shouldn't it?
It looks like there is a gap here. The installation script used by the AMI, https://github.com/pytorch/test-infra/blob/main/aws/ami/windows/scripts/Installers/Install-VS.ps1#L34, looks older and still uses VS2019, so it makes sense that VS2022 is installed every time.
Note that there is a pending PR, pytorch/test-infra#1175, that should update VS on the AMI. I haven't touched it for a while, but I can revive it if needed.
Also note that there might be a bug in collecting the VS logs that would be helpful for reporting the issue:
The workflow compresses the logs into C:\Users\runneruser\AppData\Local\Temp\vslogs.zip, but then the copy command fails with:
Copy-Item : Cannot find path 'C:\Users\circleci\AppData\Local\Temp\vslogs.zip' because it does not exist.
To summarize my chat with @malfet on the issue:
- Does this issue only happen with VS2022? If yes, could we roll back to VS2019 for the time being, as it matches what is currently in the AMI?
- Eventually we can use VS2022, but it would need to be part of the AMI (https://github.com/pytorch/test-infra/pull/1175). cc @atalman I remember that you are testing a new Windows AMI; is it possible to include this change too?