continuous-integration
Permission denied on Windows tests (flaky)
I see lots of Windows failures on this pipeline: https://buildkite.com/bazel/bazelisk-plus-incompatible-flags/builds/579
For example, I see 7 failures for the flag incompatible_string_replace_count, all of which seem flaky.
Examples of error messages:
(03:05:57) ERROR: C:/b/bk-windows-drwz/bazel-downstream-projects/buildtools/buildifier2/BUILD.bazel:3:11: no such package '@skylark_syntax//syntax': no such package '@bazel_gazelle_go_repository_cache//': C:/b/pmtr6clt/external/bazel_gazelle_go_repository_cache/pkg/mod/github.com/chzyer/[email protected]/doc (Permission denied) and referenced by '//buildifier2:go_default_library'
(03:13:11) ERROR: C:/b/hhtk5fpz/external/bazel_tools/tools/jdk/BUILD:383:6: @bazel_tools//tools/jdk:remote_toolchain depends on @remote_java_tools_windows//:toolchain in repository @remote_java_tools_windows which failed to fetch. no such package '@remote_java_tools_windows//': C:/b/hhtk5fpz/external/remote_java_tools_windows/java_tools/JavaBuilder_deploy.jar (Permission denied)
(03:05:33) ERROR: C:/b/bk-windows-601l/bazel-downstream-projects/bazelisk/BUILD:41:11: no such package '@com_github_mitchellh_go_homedir//': no such package '@bazel_gazelle_go_repository_cache//': C:/b/5t7h6vom/external/bazel_gazelle_go_repository_cache/pkg/mod/github.com/hashicorp/[email protected] (Permission denied) and referenced by '//:go_default_library'
I can reproduce this issue locally and will look into it soon!
It looks like we are hitting something similar to https://github.com/bazelbuild/bazel/issues/7458
I haven't figured out exactly why, but enabling symlink support via --windows_enable_symlinks seems to fix the problem.
OK, the failure occurs because the Java worker holds an open file handle on C:\src\tmp\cbtx3svz\external\remote_java_tools_windows\java_tools\JavaBuilder_deploy.jar.
When we change an incompatible flag that causes Bazel to refetch the Java tools (e.g. --incompatible_disable_depset_items), Bazel fails to clean up the directory C:\src\tmp\cbtx3svz\external\remote_java_tools_windows, because JavaBuilder_deploy.jar is still open.
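For anyone who wants to see the underlying behavior in isolation, here is a minimal Python sketch. It is not Bazel's code: the directory and file names (`external_repo`, `tool.jar`) and the helper names are made up for illustration, and the open handle merely stands in for the long-lived Java worker. On Windows the first removal attempt is expected to fail with a permission error while the handle is open. On Linux or macOS it normally succeeds, which would explain why only the Windows jobs are affected.

```python
import os
import shutil
import tempfile


def try_remove_tree(path):
    """Try to delete a directory tree; return the OSError on failure, else None."""
    try:
        shutil.rmtree(path)
        return None
    except OSError as exc:  # PermissionError is a subclass of OSError
        return exc


def demo_locked_cleanup():
    """Delete a directory while a file inside it is open, then again after closing."""
    root = tempfile.mkdtemp()
    repo_dir = os.path.join(root, "external_repo")  # hypothetical stand-in for an external repository
    os.makedirs(repo_dir)
    jar_path = os.path.join(repo_dir, "tool.jar")  # hypothetical stand-in for JavaBuilder_deploy.jar
    with open(jar_path, "wb") as f:
        f.write(b"placeholder")

    handle = open(jar_path, "rb")  # simulates the worker keeping the jar open
    try:
        error_while_open = try_remove_tree(repo_dir)
    finally:
        handle.close()  # simulates the worker shutting down

    # Retry only if the first attempt left the directory behind.
    error_after_close = try_remove_tree(repo_dir) if os.path.isdir(repo_dir) else None
    shutil.rmtree(root, ignore_errors=True)
    return error_while_open, error_after_close


if __name__ == "__main__":
    while_open, after_close = demo_locked_cleanup()
    print("error while handle open:", while_open)
    print("error after handle closed:", after_close)
```

In both cases the cleanup goes through once the handle is closed, which is the property the shutdown-based workaround relies on.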
To work around this issue on CI, we can use Bazelisk's BAZELISK_SHUTDOWN feature, which shuts down Bazel between builds and ensures the Java workers release their file handles.
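As a rough sketch of what that could look like in a CI script (this is not the actual pipeline code): the helper name `bazelisk_migrate_invocation` and the target pattern are hypothetical, and I'm assuming the job runs Bazelisk in `--migrate` mode with `BAZELISK_SHUTDOWN` set to a non-empty value.

```python
import os


def bazelisk_migrate_invocation(targets, base_env=None):
    """Build the command line and environment for a Bazelisk migration run.

    BAZELISK_SHUTDOWN is set to a non-empty value so that Bazelisk shuts the
    Bazel server down between builds, which should release any file handles
    still held by Java workers.
    """
    env = dict(os.environ if base_env is None else base_env)
    env["BAZELISK_SHUTDOWN"] = "1"
    cmd = ["bazelisk", "--migrate", "build", *targets]
    return cmd, env


if __name__ == "__main__":
    cmd, env = bazelisk_migrate_invocation(["//..."])
    print(" ".join(cmd))
    print("BAZELISK_SHUTDOWN =", env["BAZELISK_SHUTDOWN"])
```

The result can be passed to `subprocess.run(cmd, env=env)`. Building the environment separately keeps the variable scoped to the Bazelisk call rather than the whole job.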
It still fails: https://buildkite.com/bazel/bazelisk-plus-incompatible-flags/builds/592#83a2bd3b-a04d-4dea-8aa1-3812ec39c251
(03:03:01) ERROR: C:/b/bk-windows-drwz/bazel-downstream-projects/bazel-skylib/gazelle/BUILD:6:11: no such package '@com_github_bazelbuild_buildtools//build': no such package '@bazel_gazelle_go_repository_cache//': C:/b/zbugc6cn/external/bazel_gazelle_go_repository_cache/pkg/mod/github.com/!burnt!sushi/[email protected]/cmd/toml-test-decoder (Permission denied) and referenced by '//gazelle:go_default_library'
Can we reopen this bug? Or I can file a new one.
Oh sorry, we can reopen this. Looks like my change only fixed the Bazel Federation case, not this one.
The bug is on the Bazel side. Can you please help review the fix https://github.com/bazelbuild/bazel/pull/11982?
Let's keep this open until we are sure the problem is fixed. We'll have to wait for next month's Bazel release.
FYI, I'm hitting this in Stardoc today with Bazel 6.3.2 as well as 7.0 pre: https://github.com/bazelbuild/stardoc/pull/179 -> https://buildkite.com/bazel/stardoc/builds/1081