Implement a sensible timeout for long-running tasks
I had to cancel https://buildkite.com/bazel/bazel-at-head-plus-downstream/builds/2193 after 13 hours (!). The culprit was rules_nodejs (rbe_ubuntu1604): the first task timed out after 8 hours, and the second had been running for 5.5 hours when I cancelled it.
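To make the request concrete, here is a rough sketch of what a per-task timeout could look like in Python. It is not how the CI scripts currently work: `run_with_timeout`, the 3-hour default, and the exit code 124 are all hypothetical choices for illustration.

```python
import subprocess
import sys

# Hypothetical default; the right value would need to be agreed on.
DEFAULT_TIMEOUT_SECONDS = 3 * 60 * 60

# Assumed convention (borrowed from GNU `timeout`) to signal a timeout.
TIMEOUT_EXIT_CODE = 124


def run_with_timeout(args, timeout_seconds=DEFAULT_TIMEOUT_SECONDS):
    """Run a command and kill it if it exceeds timeout_seconds.

    Returns the command's exit code, or TIMEOUT_EXIT_CODE on timeout.
    """
    try:
        # subprocess.run kills the child and waits for it before
        # raising TimeoutExpired.
        completed = subprocess.run(args, timeout=timeout_seconds)
        return completed.returncode
    except subprocess.TimeoutExpired:
        print(
            "Task exceeded {} seconds and was killed: {}".format(timeout_seconds, args),
            file=sys.stderr,
        )
        return TIMEOUT_EXIT_CODE
```

Note that this only kills the direct child process. Anything the task spawned itself may survive, so a real implementation would probably need to kill the whole process group.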
(Assigned Yun to the wrong issue)
I'm seeing more hanging jobs at https://buildkite.com/bazel/bazelisk-plus-incompatible-flags/builds/1024#_
It looks like the task finished, but the job didn't terminate.
/cc @philwo Any idea what could cause this?
I'm tracking the hanging jobs issue in https://github.com/bazelbuild/continuous-integration/issues/1244 👍🏻