etl

Published 20 hours ago •


Staging servers have non-deterministic failures

Open larsyencken opened this issue 5 months ago • 2 comments

Problem

Anecdotally, staging server pipelines often fail at random, but a manual retry of the same build on Buildkite then succeeds.

Expected behaviour

We'd expect pipelines to succeed on the first run, without manual retries.

Gathering more information

Could we periodically review recent builds on each branch?
Could we dump all pipeline builds, including failures and retries, using the Buildkite API?
Are there additional steps that should have automatic retries?
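
As a starting point for the questions above, here is a minimal sketch of how builds could be dumped and scanned for flaky steps. It assumes the Buildkite REST API v2 builds endpoint and the job fields `state`, `retried`, and `retried_in_job_id` as I understand them from the public API docs; the function names `fetch_builds` and `flaky_steps` are hypothetical, and nothing here has been run against our pipelines.

```python
import json
import urllib.request
from collections import Counter

API_ROOT = "https://api.buildkite.com/v2"


def fetch_builds(org, pipeline, token, per_page=100):
    """Fetch one page of recent builds for a pipeline.

    Assumption: each build in the response embeds its jobs, and the token
    has permission to read builds.
    """
    url = (
        f"{API_ROOT}/organizations/{org}/pipelines/{pipeline}"
        f"/builds?per_page={per_page}"
    )
    req = urllib.request.Request(url, headers={"Authorization": f"Bearer {token}"})
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)


def flaky_steps(builds):
    """Count, per step name, failed jobs whose immediate retry passed."""
    counts = Counter()
    for build in builds:
        # Index jobs by id so a retried job can be matched to its retry.
        jobs = {job["id"]: job for job in build.get("jobs", []) if "id" in job}
        for job in jobs.values():
            if not job.get("retried") or job.get("state") != "failed":
                continue
            retry = jobs.get(job.get("retried_in_job_id"))
            if retry is not None and retry.get("state") == "passed":
                counts[job.get("name") or "<unnamed>"] += 1
    return counts
```

Calling `flaky_steps(fetch_builds(org, pipeline, token))` with our organization slug, a pipeline slug, and an API token would give a per-step count of "failed, then passed on retry" jobs. Steps that show up repeatedly would be candidates for automatic retries.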

Sep 12 '24 09:09 larsyencken