continuous-integration
Bazel CI is flaky due to cloning buildkite docker plugin
Bazel Auto Sheriff often reports flaky jobs that fail like this:
$ git clone -v -- https://github.com/buildkite-plugins/docker-buildkite-plugin .
Cloning into '.'...
fatal: unable to access 'https://github.com/buildkite-plugins/docker-buildkite-plugin/': Failed to connect to github.com port 443: Connection timed out
https://buildkite.com/bazel/bazel-auto-sheriff-face-with-cowboy-hat/builds/90
Is there a way we could cache this repo to avoid cloning from GitHub every time?
This is also the cause of https://github.com/bazelbuild/continuous-integration/issues/963: the flaky error prevents the last green commit from being updated, and rerunning the failed job does not trigger the update step again.
/cc @philwo
I'm running into this on BazelCI for an internal CL: https://buildkite.com/bazel/google-bazel-presubmit/builds/36883.
Yeah, our recent CI nightly failed only due to this error: https://buildkite.com/bazel/bazel-auto-sheriff-face-with-cowboy-hat/builds/216
/cc @philwo Do you have any idea on this?
Hmpf, I agree this is really annoying. Let's add retry support for cloning plugins to the Buildkite Agent. I'll file an issue.
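For illustration only, the retry idea could be sketched as a small wrapper that reruns a command with exponential backoff when it exits non-zero. This is not the Buildkite Agent's implementation (the agent clones plugins itself); `run_with_retry`, `attempts`, and `base_delay` are hypothetical names, and the sketch assumes a transient network failure shows up as a non-zero exit code from `git clone`.

```python
import subprocess
import time


def run_with_retry(cmd, attempts=3, base_delay=2.0,
                   runner=subprocess.run, sleep=time.sleep):
    """Run cmd, retrying with exponential backoff on a non-zero exit code.

    Hypothetical helper: returns the number of attempts used and raises
    RuntimeError if every attempt fails. `runner` and `sleep` are
    injectable so the logic can be exercised without network access.
    """
    for attempt in range(1, attempts + 1):
        result = runner(cmd)
        if result.returncode == 0:
            return attempt
        if attempt < attempts:
            # Delay doubles each time: base_delay, 2 * base_delay, ...
            sleep(base_delay * 2 ** (attempt - 1))
    raise RuntimeError(f"command failed after {attempts} attempts: {cmd}")


# Example call (not executed here); "plugin-checkout" is an assumed target
# directory. A failed attempt may leave a partial checkout behind, so a real
# script would clean the target directory between attempts.
# run_with_retry([
#     "git", "clone", "-v", "--",
#     "https://github.com/buildkite-plugins/docker-buildkite-plugin",
#     "plugin-checkout",
# ])
```

A wrapper like this only masks short outages. Caching the plugin repository on the worker, as asked above, would avoid the network round trip entirely.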