True unconditional cleanup step possible?

Open mallyvai opened this issue 6 years ago • 16 comments

We're using Buildkite to help with some integration testing by doing a tiny bit of AWS orchestration. We need to clean up these resources when the pipeline is done, regardless of whether the tests have succeeded. There are multiple wait steps in the Buildkite pipeline.

I found this older issue: https://github.com/buildkite/feedback/issues/133

It seems to imply there's no way to do this reliably today if there are multiple wait steps.

Is this true? Would be great to get some input on an alternate pattern if so!

mallyvai avatar Apr 12 '18 04:04 mallyvai

I was thinking about triggering a separate pipeline, but that's still like any other step in that it won't be triggered if there are failures further up, correct?

mallyvai avatar Apr 12 '18 17:04 mallyvai

We've been using continue_on_failure: true and it's worked fine here with multiple wait steps.

steps:
  - name: "Build"
    command: "echo build"

  - wait

  - name: "Test"
    command: "echo test"

  - wait: ~
    continue_on_failure: true

  - name: "Cleanup"
    command: "echo cleanup"

Even if the build or test steps fail, the cleanup step always runs.

One edge case we just hit was when someone cancels the build. That cancels all subsequent jobs, including the clean up one. Maybe a continue_on_cancel option would be desirable (or including cancellation in the list of failure modes).

gtirloni avatar Apr 12 '18 19:04 gtirloni

@gtirloni weird, that doesn't work for me, I tried with this configuration and the cleanup step was not called.

haines avatar Apr 25 '18 08:04 haines

@haines in case it helps, this is the pipeline.yml we're using in one of our open source projects: https://github.com/fluid-project/infusion/blob/master/.buildkite/pipeline.yml

gtirloni avatar Apr 25 '18 12:04 gtirloni

@haines I think I understand what you mean. "continue_on_failure" only continues when the immediately preceding stage fails; if there are more stages between the one that failed and the last one, it doesn't. Yeah, a true unconditional continue-on-error catch-all would be great for us too.

gtirloni avatar Jun 12 '18 21:06 gtirloni

Yeah, that would be great! Jenkinsfile has this structure, and it's great for cleanup.

I ended up doing it with the build.finished webhook: when it fires, I call back into a "cleanup" pipeline through the Buildkite API.

jmendiara avatar Jul 12 '18 16:07 jmendiara

I've seen quite a few Buildkite issues along these lines. Most have been closed, but this one addresses specifically what we need. Here's our situation:

For each build, we bring up a docker container that is reused (for caching build artifacts) across each step. We have multiple wait: {continue_on_failure: true} blocks in the pipeline, ensuring that we eventually get to the last step where we bring down and delete the container.
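A minimal sketch of that layout might look like the following. The script names `start_container.sh` and `stop_container.sh` are hypothetical stand-ins for the bring-up and teardown commands; only `wait` with `continue_on_failure: true` is taken from the examples earlier in this thread.

```yaml
steps:
  - name: "Start container"
    command: "./start_container.sh"   # hypothetical bring-up script

  # Every wait is marked, so a failure in any stage does not stop the pipeline.
  - wait: ~
    continue_on_failure: true

  - name: "Build"
    command: "echo build"

  - wait: ~
    continue_on_failure: true

  - name: "Test"
    command: "echo test"

  - wait: ~
    continue_on_failure: true

  - name: "Stop container"
    command: "./stop_container.sh"    # hypothetical teardown script
```

With every wait marked this way, the teardown step should be reached after a failure in any earlier step. The trade-off is that intermediate steps also keep running after a failure, and a cancelled build still skips teardown.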

We cannot use the docker plugin for a few reasons (we have to run a bunch of commands on bringup of the container, and this is pretty heavy). Nor can we use the pre-exit hook: it runs after every build step, whereas we want to run our cleanup once, after the whole build.

Are there any features we're missing (or are coming) that would allow this? Either a way to mark a build step as "run this at the end, even when cancelling" or a way to create a separate pipeline or single command that is run at the end of all builds, even cancelled builds.

Thank you!

fahhem avatar Jul 20 '18 04:07 fahhem

Cleanup steps should also consider cancelled jobs.

  - command: maybe_fails.sh

  - wait: ~
    continue_on_failure: true

  - command: cleanup.sh

cleanup.sh is only executed after maybe_fails.sh, but a build can be cancelled by other means (user action on the UI, another push...) and we need to clean up unconditionally.

jmendiara avatar Jul 30 '18 10:07 jmendiara

Wanted to bump this ticket. We would love to expand our Buildkite usage to subsume more of our Jenkins usage, but until we get something closer to an unconditional cleanup step I'm afraid that would be difficult...

mallyvai avatar Aug 07 '18 20:08 mallyvai

@mallyvai Try the 'pre-exit' repository hook: https://buildkite.com/docs/agent/v3/hooks

byrnedo avatar Nov 14 '18 13:11 byrnedo

@byrnedo That runs before each job (aka step) finishes, not when the pipeline finishes (or is canceled). Also, the pre-exit hook doesn't know whether the current step is the last one, or even whether the build is being canceled; otherwise it could be used for this issue's purpose.

fahhem avatar Nov 15 '18 23:11 fahhem

Bumping this (sorry for the noise). It's been almost 2 years since this issue was opened. It would be great if someone from the Buildkite team could chime in, even if it is only to confirm that this won't be implemented.

edrevo avatar Jan 10 '20 08:01 edrevo

Having a - finally section would be incredibly useful for us at ButterflyNetwork as well. As a workaround, we are considering having multiple passes of test results collection using depends_on within the same stage. We also use dynamic pipeline generation, meaning we need to filter depends_on for existing steps. The multi-pass approach will waste resources, but it seems to be the only option for now.
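One way a dependency-based collection step might be expressed, as a rough sketch rather than an actual production pipeline: as far as I know, Buildkite's `depends_on` accepts an `allow_failure` flag per dependency, so a step can be asked to run even when the step it depends on failed. The step keys and script names below are hypothetical.

```yaml
steps:
  - name: "Tests"
    key: "tests"                      # hypothetical step key
    command: "./run_tests.sh"         # hypothetical script

  - name: "Collect test results"
    command: "./collect_results.sh"   # hypothetical script
    depends_on:
      # Run this step even if "tests" finished in a failed state.
      - step: "tests"
        allow_failure: true
```

With dynamically generated pipelines, the `depends_on` list would still need to be filtered down to steps that actually exist. Nothing in this thread establishes whether such a step runs for cancelled builds.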

ccarpita avatar Feb 13 '20 22:02 ccarpita

+1 for this feature

For now, I created a workaround: each passing step sets buildkite-agent meta-data, and subsequent steps check it so they can print a message and fail early. Then I make all the wait steps continue on failure, and it seems to do what I want.
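A minimal sketch of what such a pipeline could look like, assuming the agent's `meta-data set` and `meta-data exists` subcommands. The meta-data key `build-passed` and the scripts `build.sh`, `test.sh`, and `cleanup.sh` are hypothetical names chosen for illustration, not my exact setup.

```yaml
steps:
  - name: "Build"
    # Record success only if the build command itself succeeded.
    command: "./build.sh && buildkite-agent meta-data set build-passed true"

  - wait: ~
    continue_on_failure: true

  - name: "Test"
    command: |
      # Fail early with a message if Build never recorded success.
      if ! buildkite-agent meta-data exists build-passed; then
        echo "Build did not pass, skipping tests"
        exit 1
      fi
      ./test.sh

  - wait: ~
    continue_on_failure: true

  - name: "Cleanup"
    command: "./cleanup.sh"
```

The guard keeps later steps from doing real work after an earlier failure, while the waits still let the pipeline reach the cleanup step. It does not help when the build is cancelled.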

aganders3 avatar Mar 25 '20 18:03 aganders3

- finally would be great

cyn110 avatar Sep 07 '22 20:09 cyn110

+1 for this feature. It is critical for pipelines that need to reset the environment no matter what happened before.

speakless86 avatar Mar 13 '23 19:03 speakless86