
Staging servers have non-deterministic failures

Open larsyencken opened this issue 5 months ago • 2 comments

Problem

Anecdotally, staging server pipelines often seem to fail at random, yet a manual retry on Buildkite then succeeds.

Expected behaviour

We'd hope for things to work the first time.

Gathering more information

  • Could we periodically review recent builds on each branch?
  • Could we dump all pipelines including failures and retries using the API?
  • Are there additional steps that should have automatic retries?
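
As a starting point for the API question, here is a minimal Python sketch that counts steps which failed and then passed on retry. It assumes the Buildkite REST API builds endpoint with a bearer token, and that each job carries `state`, `retried` and `retried_in_job_id` fields; those field names should be checked against the API docs before relying on this. The function names and the org and pipeline slugs are hypothetical.

```python
import json
import urllib.request
from collections import Counter

API_ROOT = "https://api.buildkite.com/v2"


def fetch_builds(org, pipeline, token, per_page=100):
    """Fetch one page of recent builds (org/pipeline slugs are placeholders)."""
    url = (
        f"{API_ROOT}/organizations/{org}/pipelines/{pipeline}"
        f"/builds?per_page={per_page}"
    )
    req = urllib.request.Request(url, headers={"Authorization": f"Bearer {token}"})
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)


def flaky_steps(builds):
    """Count, per step name, failed jobs whose final retry passed.

    Assumes each job dict has "id", "name", "state", "retried" and
    "retried_in_job_id" keys; missing keys are treated as "not retried".
    """
    counts = Counter()
    for build in builds:
        jobs = build.get("jobs", [])
        by_id = {job["id"]: job for job in jobs if "id" in job}
        for job in jobs:
            if job.get("state") != "failed" or not job.get("retried"):
                continue
            # Follow the retry chain to the last attempt of this step.
            final = by_id.get(job.get("retried_in_job_id"))
            seen = set()
            while final is not None and final.get("retried") and final["id"] not in seen:
                seen.add(final["id"])
                final = by_id.get(final.get("retried_in_job_id"))
            if final is not None and final.get("state") == "passed":
                counts[job.get("name") or "(unnamed)"] += 1
    return counts
```

Running `flaky_steps(fetch_builds("my-org", "my-pipeline", token))` periodically would give a rough ranking of which steps are candidates for automatic retries, though a single page of builds is only a sample.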

larsyencken · Sep 12 '24 09:09