cmd/cue: CreateFile "Access is denied" errors on Windows with modcache partial files
This CI flake has shown up three times in the past few weeks, so it's time we track it and try to fix it.
--- FAIL: TestScript (1.42s)
--- FAIL: TestScript/registry_submodule (0.10s)
testscript.go:585: # This test tests that submodules work OK. (0.070s)
> exec cue eval .
[stderr]
main.org@v0: import failed: cannot find package "example.com/foo@v0": cannot fetch [email protected]: CreateFile $WORK\.tmp\cache\mod\download\example.com\@v\v0.0.1.partial: Access is denied.:
./main.cue:4:2
[exit status 1]
FAIL: testdata\script\registry_submodule.txtar:3: unexpected command failure
FAIL
FAIL cuelang.org/go/cmd/cue/cmd 26.132s
cc @rogpeppe
https://github.com/golang/go/issues/58697 looks extremely similar.
Note that our modcache package was imported from Go fairly recently, so it should already be pretty robust.
Happened again today on https://review.gerrithub.io/c/cue-lang/cue/+/1214081.
Just happened again on a master commit: https://github.com/cue-lang/cue/actions/runs/17579000657/job/49930787837
The error seems to come from os.WriteFile in our modcache package:
// Extract the module zip directory at its final location.
//
// To prevent other processes from reading the directory if we crash,
// create a .partial file before extracting the directory, and delete
// the .partial file afterward (all while holding the lock).
//
// A technique used previously was to extract to a temporary directory with a random name
// then rename it into place with os.Rename. On Windows, this can fail with
// ERROR_ACCESS_DENIED when another process (usually an anti-virus scanner)
// opened files in the temporary directory.
if err := os.MkdirAll(parentDir, 0777); err != nil {
return module.SourceLoc{}, err
}
if err := os.WriteFile(partialPath, nil, 0666); err != nil {
return module.SourceLoc{}, err
}
if err := modzip.Unzip(dir, mv, zipfile); err != nil {
My initial guess was that we should be using robustio for this file creation, because it knows what OS errors are ephemeral, like ERROR_ACCESS_DENIED on Windows. However, upstream Go seems to not do this; their modfetch package has:
// Extract the module zip directory at its final location.
//
// To prevent other processes from reading the directory if we crash,
// create a .partial file before extracting the directory, and delete
// the .partial file afterward (all while holding the lock).
//
// Before Go 1.16, we extracted to a temporary directory with a random name
// then renamed it into place with os.Rename. On Windows, this failed with
// ERROR_ACCESS_DENIED when another process (usually an anti-virus scanner)
// opened files in the temporary directory.
//
// Go 1.14.2 and higher respect .partial files. Older versions may use
// partially extracted directories. 'go mod verify' can detect this,
// and 'go clean -modcache' can fix it.
if err := os.MkdirAll(parentDir, 0777); err != nil {
return "", err
}
if err := os.WriteFile(partialPath, nil, 0666); err != nil {
return "", err
}
if err := modzip.Unzip(dir, mod, zipfile); err != nil {
Interestingly, upstream Go doesn't seem to see this error very often. I could not find an open issue matching "partial" and "access denied" that has shown a failure in the last year, yet we have seen three failures in the last year already.
My intuition was to replace os.WriteFile with some sort of "robust" API we copied from Go. However, renameio.WriteFile was deleted as it was not safe and it did not really solve any problem completely. Similarly, they stopped exposing an IsEphemeralError, which seems to suggest we shouldn't be retrying here.
It's still unclear to me why we're the only ones running into this issue, or how to resolve it.
Worth noting that Namespace said that they disable the Windows built-in anti-virus for the C: drive, which is where testscript places temporary directories, including the module cache causing the error above.