Bazel CI is flaky due to cloning buildkite docker plugin

Open meteorcloudy opened this issue 4 years ago • 3 comments

Bazel Auto Sheriff often reports flaky jobs failing with errors like:

$ git clone -v -- https://github.com/buildkite-plugins/docker-buildkite-plugin .
Cloning into '.'...
fatal: unable to access 'https://github.com/buildkite-plugins/docker-buildkite-plugin/': Failed to connect to github.com port 443: Connection timed out

https://buildkite.com/bazel/bazel-auto-sheriff-face-with-cowboy-hat/builds/90

Is there a way we could cache this repo to avoid cloning from GitHub every time?
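One possible shape for such a cache, as a rough sketch only: keep a local mirror of the plugin repo on the worker and clone from that instead of from GitHub. The function name, the mirror path, and the injectable `run`/`exists` parameters are all hypothetical and not part of Bazel CI or the Buildkite Agent; only the `git` subcommands themselves are standard.

```python
import os
import subprocess


def clone_via_cache(url, dest, mirror, run=subprocess.run, exists=os.path.isdir):
    """Clone `url` into `dest` through a local mirror at `mirror`.

    Hypothetical helper: names and layout are assumptions for illustration.
    """
    if not exists(mirror):
        # First use on this machine: populate the cache (needs the network once).
        run(["git", "clone", "--mirror", "--", url, mirror], check=True)
    else:
        # Best-effort refresh; if GitHub is unreachable, the stale mirror is still used.
        run(["git", "-C", mirror, "remote", "update", "--prune"], check=False)
    # Clone from local disk, so this step does not depend on github.com.
    run(["git", "clone", "--", mirror, dest], check=True)
```

With this layout a connection timeout would only cost freshness of the cached plugin, not the whole job, at the price of possibly running a slightly outdated plugin version.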

This is also the cause of https://github.com/bazelbuild/continuous-integration/issues/963: the flaky error prevents the last green commit from being updated, and rerunning the failed job won't trigger the update step again.

/cc @philwo

meteorcloudy avatar Mar 16 '20 08:03 meteorcloudy

I'm running into this on BazelCI for an internal CL: https://buildkite.com/bazel/google-bazel-presubmit/builds/36883.

haxorz avatar Jul 17 '20 00:07 haxorz

Yeah, our recent CI nightly failed only due to this error: https://buildkite.com/bazel/bazel-auto-sheriff-face-with-cowboy-hat/builds/216

/cc @philwo Do you have any ideas on this?

meteorcloudy avatar Jul 17 '20 10:07 meteorcloudy

Hmpf, I agree this is really annoying. Let's add retry support for cloning plugins to the Buildkite Agent. I'll file an issue.
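To illustrate the idea, here is a minimal sketch of retrying a clone with exponential backoff. It is not the Buildkite Agent's implementation; the function name, the default attempt count and delay, and the injectable `run`/`sleep` parameters are assumptions made for this example.

```python
import subprocess
import time


def clone_with_retry(url, dest, attempts=3, base_delay=2.0,
                     run=subprocess.run, sleep=time.sleep):
    """Run `git clone` up to `attempts` times, doubling the wait between tries.

    Hypothetical helper: defaults are illustrative, not taken from any real agent.
    Returns the number of attempts that were needed.
    """
    for attempt in range(1, attempts + 1):
        result = run(["git", "clone", "-v", "--", url, dest])
        if result.returncode == 0:
            return attempt
        if attempt < attempts:
            # Wait base_delay, then 2x, 4x, ... before the next try.
            sleep(base_delay * 2 ** (attempt - 1))
    raise RuntimeError(f"git clone of {url} failed after {attempts} attempts")
```

A transient "Connection timed out" like the one above would then only fail the job if every attempt hits it.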

philwo avatar Jul 17 '20 10:07 philwo