agent icon indicating copy to clipboard operation
agent copied to clipboard

Use reverse ordering for post-command hooks

Open jessebye opened this issue 3 years ago • 3 comments

Is your feature request related to a problem? Please describe.

We use a plugin that assumes an AWS role during the pre-command hook. We also use a plugin (https://github.com/gencer/cache-buildkite-plugin) for caching, which reads from an s3 bucket using the agent's instance role credentials.

The pre-command hooks work fine, since we can rely on ordering and have the cache plugin go first. Then the assume role plugin runs and injects AWS_ACCESS_KEY type environment variables, which we use in the command.

The post-command hooks fail though, because the cache plugin goes first, but tries to write to s3 using the assumed role.

Describe the solution you'd like

Use reverse ordering for post-command hooks. This ensures that they can clean up in the proper order and avoid this type of problem.

Describe alternatives you've considered

We've just stopped using the cache plugin for now, but that isn't ideal.

Additional context

n/a

jessebye avatar May 20 '22 19:05 jessebye

Are you describing this as the current behaviour?

  • plugin 1 (cache) pre-command hook
  • plugin 2 (assume-role) pre-command hook
  • plugin 3 (whatever) pre-command hook
  • command
  • plugin 1 (cache) post-command hook
  • plugin 2 (assume-role) post-command hook
  • plugin 3 (whatever) post-command hook

And you're suggesting:

  • plugin 1 (cache) pre-command hook
  • plugin 2 (assume-role) pre-command hook
  • plugin 3 (whatever) pre-command hook
  • command
  • plugin 3 (whatever) pre-command hook
  • plugin 2 (assume-role) post-command hook
  • plugin 1 (cache) post-command hook

… because tear-down should happen in the opposite order to setup?

 - plugin 1 (cache) pre-command hook
 - plugin 2 (assume-role) pre-command hook
 - plugin 3 (whatever) pre-command hook
 - command
-- plugin 1 (cache) post-command hook
-- plugin 2 (assume-role) post-command hook
-- plugin 3 (whatever) post-command hook
+- plugin 3 (whatever) post-command hook
+- plugin 2 (assume-role) post-command hook
+- plugin 1 (cache) post-command hook

That does sound sensible. It's probably also a breaking change for many pipelines 😬 I wonder if there's a way we can introduce it… perhaps as an agent 4.x thing, or as an experiment in the mean time, or some other type of opt-in flag 🤔

pda avatar Jul 05 '22 02:07 pda

Hitting this exact issue right now. The only workaround I can think of is that every plugin that wants to run in a specific scope needs to somehow snapshot that scope during the pre-command hook and do a context shuffle inside its own post-command hook, which feels 😞

franklin-ross avatar Nov 30 '23 01:11 franklin-ross

Here it is. Works OK for my needs: https://github.com/franklin-ross/aws-restore-role-buildkite-plugin

An alternative to a breaking change would be to add a new hook, like post-command-teardown, that runs in reverse order and is specifically designed for this kind of thing. That wouldn't need any major version bump. Then the aws-assume-role plugin could restore the role during teardown and other plugins using the teardown wouldn't get any weird, interleaved roles. I guess that would be a breaking change for the assume role plugin itself, but that seems much easier for everyone to manage, and a much faster solve than waiting for the next agent major version bump.

franklin-ross avatar Dec 15 '23 05:12 franklin-ross