agent Support plugin SHA-1 digests

This adds the ability to specify a plugin git commit SHA-1 (ala git show-ref / git rev-parse), preventing any modification to the plugin after it has been added to the pipeline.yml file:

steps:
  - command: |
      node -e 'console.log("Hello world")'
    plugins:
      docker#v3.2.0@git-sha1:f199bc812a7d7c770139fe8f67203010e103d0f8:
        image: node

If the digest matches then this is the output, and everything continues hunky dory:

If the SHA-1 of the plugin checkout doesn't match the SHA-1 specified in the pipeline.yml, you get

Some design notes:

The intention is that this is the new default for how people specify plugins in their pipeline.yml files. Digests should be baked into the default usage for everyone, helping increase the security of the entire ecosystem, just like it is bundler/npm/yarn/etc. Rather than something that can be specified for those who are worried.
I can easily add support for the commit SHA updating into https://github.com/toolmantim/boomper, which we're using to help keep the plugin readme version numbers up-to-date. This means we retain the same API for usage (copying+pasting from readme markdown file examples), and no extra maintenance burden for plugin authoring.
The git-sha1 syntax was chosen over just specifying the SHA1, so we're explicitly stating what/where the SHA1 comes from. i.e. docker#v3.2.0@git-sha1:f199bc812a7d7c770139fe8f67203010e103d0f8 is assumed to be clearer, and more self-documenting, than docker#v3.2.0@f199bc812a7d7c770139fe8f67203010e103d0f8
There's nothing to change on the backend side, it just passes the plugin string all the way back through to the agent to handle. I've given it a test, and works well.
Running this new digest plugin step on an older agent will fail (e.g. error: pathspec 'v3.2.0@git-sha1:f199fc812a7d7c770139fe8f67203010e103d0f8' did not match any file(s) known to git.). I believe this behaviour is okay (even desired) because if you're specifying digests then you want to make sure they're being enforced and not just ignored. (We can auto-detect the error in the build output, same as we do SSH key problems).
Providing a digest here should avoid the need to vendor plugins for the sake of ensuring they're not tampered with (e.g. a tag is force-pushed, or the repo taken over)

[x] Works on backend (no changes needed)
[x] Initial implementation
[x] Feedback on overall design
[x] Error on incorrect digest format (e.g. docker#v3.2.0@invalid:f199bc812a7d7c770139fe8f67203010e103d0f8:). At the moment invalidly formatted digests are silently ignored, which is bad because they might appear correct to the eye but are silently ignored.
[ ] Tests
[ ] Code review

Jun 09 '19 14:06 toolmantim

This is looking awesome!

Jun 10 '19 01:06 lox

One concern I've thought of, is that if the readmes of the plugins are updated to use this new syntax, people might copy+paste them and try to run it on an older agents that don't support it. In this case, they'll get the error pathspec 'v3.2.0@git-sha1:f199fc812a7d7c770139fe8f67203010e103d0f8' did not match any file(s) known to git.. We can auto-detect this error message in the build output in the web UI, and show a useful helpful message with links to the docs, same as we do for ssh key permissions problems.

We should also ensure we've released a new Elastic Stack with support, in advance of changing the readmes.

Jun 10 '19 09:06 toolmantim

@toolmantim could we check for the agent version on the server side and trim off the digest for agents that don't support it? We do pre-processing of the plugins currently.

We could show an error on the build to indicate that the agent didn't support the digest pinning.

Jun 11 '19 08:06 lox

Yeah we could, I thought of that too. I reckon you'd want the job to fail still though yeah? So either you fail the job on the backend with some fake build log output somehow or extra UI logic, or you just let it fail on the old agent but also show a nicer error message in the web UI like we do with SSH problems.

Jun 11 '19 08:06 toolmantim

or you just let it fail on the old agent but also show a nicer error message in the web UI like we do with SSH problems.

I reckon that is the way to go!

Jun 11 '19 08:06 lox

Do we want to allow partial sha1 matches? E.g let folks provide 7-10 digits of the sha1 prefix?

Jun 11 '19 08:06 lox

Do we want to allow partial sha1 matches? E.g let folks provide 7-10 digits of the sha1 prefix?

For the sake of security, I was thinking not. I wasn't sure what guarantees you get with the truncated string. Any idea?

I just did a quick search on the crypto guarantees Git gives you, and found this disheartening advice: https://en.wikipedia.org/wiki/SHA-1#Data_integrity

Which links to: https://marc.info/?l=git&m=115678778717621&w=2

But, it looks like Git is transitioning to SHA256 since that, and that the SHA-1 implementation is hardened against SHAttered: https://github.com/git/git/blob/master/Documentation/technical/hash-function-transition.txt

Not sure when GitHub is planning to support it:

$ curl https://api.github.com/repos/buildkite-plugins/docker-compose-buildkite-plugin/git/refs/tags/v3.0.3
{
  "ref": "refs/tags/v3.0.3",
  "node_id": "MDM6UmVmNDUwMTc0NDA6djMuMC4z",
  "url": "https://api.github.com/repos/buildkite-plugins/docker-compose-buildkite-plugin/git/refs/tags/v3.0.3",
  "object": {
    "sha": "f7f2d362fe37c51f7f42f485385314b412416e17",
    "type": "commit",
    "url": "https://api.github.com/repos/buildkite-plugins/docker-compose-buildkite-plugin/git/commits/f7f2d362fe37c51f7f42f485385314b412416e17"
  }
}

We could go with SHA-1 for now, and support a git-sha256:xxx digest in the future when Git and/or GitHub supports it? Not sure if this meets people's needs and expectations? The alternative would be finding another way to specify and calculate the digests.

Jun 11 '19 10:06 toolmantim

I added a few things, but I couldn't quite get the new --require-plugin-digests to work — it seems nil/false in the bootstrap 🤔

Would love a hand getting this finished up @lox!

Jun 15 '19 13:06 toolmantim

I'll help get it across the line this week @toolmantim!

Jun 15 '19 23:06 lox

@toolmantim interesting thinking on "--require*" I was imagining that we'd go for --no-plugins --allow-plugins-with-digests --allow-vendored-plugins.

Thoughts on which direction you prefer?

Jun 15 '19 23:06 lox

How would you only allow vendored plugins with the --require* syntax I wonder. It seems to not lend itself very well to allowing multiple different types of "secure" plugin methods.

Jun 15 '19 23:06 lox

The allow syntax also opens up things like --no-plugins --allow-plugins-with-org=buildkite --allow-plugins-with-digests --allow-vendored-plugins.

Jun 16 '19 00:06 lox

Although that is ambiguous as to whether digest plugins are required to be buildkite org 🤔

Jun 16 '19 00:06 lox

--plugin-condition "(plugin.organization == 'buildkite' && plugin.digest != nil) || plugin.vendored"? 😅

Jun 16 '19 00:06 lox

(We could have that very easily)

Jun 16 '19 00:06 lox

Shall we merge this without the flag @toolmantim and look at something like https://github.com/buildkite/agent/pull/1034 next?

Jun 16 '19 04:06 lox

Shall we merge this without the flag @toolmantim and look at something like #1034 next?

Yeah, you're totally right! We should just merge digest support first.

Jun 16 '19 10:06 toolmantim

--no-plugins --allow-plugins-with-org=buildkite --allow-plugins-with-digests --allow-vendored-plugins

I really like this one I think! 👌🏼

Jun 16 '19 10:06 toolmantim

--no-plugins --allow-plugins-with-org=buildkite --allow-plugins-with-digests --allow-vendored-plugins I really like this one I think! 👌🏼

Actually, I was hoping that --require-plugin-digests would become the new default in a future version of the agent, with a warning perhaps for non-vendored plugins without digests. It's sort of separate to the idea of "allowing".

So if we wanted to recommend a new secured agent setup, it might not use --no-plugins at all, but perhaps:

--require-plugin-digests --allow-plugin-host=https://github.com/my-org
--require-plugin-digests --allow-plugin=https://github.com/my-org/my-plugin
--require-plugin-digests --allow-plugin-host=https://github.com/my-org --allow-plugin=https://github.com/other-org/other-plugin

And in future agent versions we could drop the --allow-plugin-hosts part altogether, so a common secure agent setup might be just --allow-plugin-hosts=https://github.com/my-org — which would require plugin digests, only allow your own orgs plugins, and also allow vendored plugins.

Jun 16 '19 11:06 toolmantim

I reckon it’s going to get too complicated to do with flags @toolmamtim. I’m still also a bit unsure about whether requiring everyone use digests should be a thing, but keen to see how it plays out.

Jun 16 '19 22:06 lox

I'd like to hear more about this! Specifically the approached taken in this PR vs:

Providing a digest here should avoid the need to vendor plugins for the sake of ensuring they're not tampered with (e.g. a tag is force-pushed, or the repo taken over)

Asking folks to figure out the magic to make docker#v3.2.0@git-sha1:f199bc812a7d7c770139fe8f67203010e103d0f8 work I think sounds a little complicated?

Jun 17 '19 00:06 keithpitt

I reckon it’s going to get too complicated to do with flags

Fair call!

I’m still also a bit unsure about whether requiring everyone use digests should be a thing

The intention of the PR wasn't just "Add the ability to specify a SHA in the pipeline.yml if you're worried about security" but more "Figure out a way forward where SHA digests are the default".

Perhaps I'll put down some of the thinking / context, because I'm not entirely sure of the overall future strategy for plugin installation and usage, so I'm not sure if

I can understand the tension from the point of view that it's a human authored file (rather than a yarn.lock for example). And if you're using a plugin from your own private GHE install for example, or your own public GitHub organization, then it might feel like overkill (although if it's your own open-source GitHub plugin, we still want the README examples to include pinning for other peeps).

I guess our situation is similar to a Dockerfile, in that it's human authored, and there's no additional lockfile that's used to pin things down (unlike most other dep systems, like RubyGems/NPM/Yarn). Few people seem to use Docker SHA pinning, which means that the whole ecosystem is insecure-by-default. Remembering to add a SHA, or deciding if you should, is left as an exercise to the user every time they do it—some teams add Dockerfile linters to require it.

Ours is different to the Dockerfile declaration though, because Dockerfile tags are also used for sem-ver like fuzzy version matching (e.g. FROM node or FROM node:12) and the image potentially changing depending on what machine you're running it on. With Docker it's common to push an existing a tag with entirely new content too. With pipeline.yml we recommend always using a full tag it's the same plugin version on all jobs, and that it doesn't just break without you changing anything. This means we recommend people bump plugin version numbers manually, but with the help of tools like Renovate/Dependabot to make it easier, and potentially via BK CLI too.

We don't support sem-ver like deps, and the agent doesn't even pull down the latest master if you refer to just docker-compose rather than docker-compose#v3.2.0. So I feel the format/design of the system already lends itself to full pinning, with the goal/assumption that the plugin code should never change without you touching the the pipeline.yml.

Another thing I'm trying to think though is that our install method is copy+pasting from GitHub readmes into the pipeline.yml. Because the readmes aren't in control, it's been up to the authors to put the full version numbers in there. Luckily it looks like most plugin authors are using our linter 🙌🏼, and that ensures that usage examples include an up-to-date version number, but they're not using https://github.com/toolmantim/boomper to automatically update the readme. Having to manually bump #v3.0.3:git-sha1:f7f2d362fe37c51f7f42f485385314b412416e17 is harder than just #v3.0.3 (though at least the GitHub UI shows the commit SHA for the release pretty easily). I think that the digest I've proposed here is still fairly accessible to plugin authors, and if we were to publish a buildkite-plugin-toolkit GitHub App/Action people can add to their plugin repos we could make it dead simple (it'd be a small rename and reworking of the existing Boomer app). There's a few other things that plugin authors would love to see in that app already too, like having it perform the linter checks for you too.

To sum up the assumptions with this PR so far:

For now we're sticking with recommending pinned plugin versions in pipeline.yml files
For now we're staying with a readme based install method
We want people to include a digest for any third-party plugin use — which means that the copy+paste plugins from GitHub will need to include a digest — so we're secure by default rather than 'secure if you could be bothered'
We have a good chance of moving plugin authors to including digests in their readme, through linter updates and a potentially new plugin toolkit GitHub Action/app

I reckon if we change enough of these assumptions, we should look at a potential new solution too. There's probably some alternative approaches to the same problem if we're thinking of changing direction.

Jun 17 '19 00:06 toolmantim

I'd like to hear more about this! Specifically the approached taken in this PR vs:

Providing a digest here should avoid the need to vendor plugins for the sake of ensuring they're not tampered with (e.g. a tag is force-pushed, or the repo taken over) Asking folks to figure out the magic to make docker#v3.2.0@git-sha1:f199bc812a7d7c770139fe8f67203010e103d0f8 work I think sounds a little complicated?

Yeah, for sure! So vendored plugins solves it for those willing to do that. And there's other things you can do to make things secure too, like: https://github.com/buildkite/buildkite-allowed-plugins-hook-example. They all require a lot of work though, and only a small % of people will do it.

This was an attempt to try to make the default secure, for 99% of uses of plugins, in the same way you don't worry about RubyGems / NPM package security either (they have digests built into their lockfiles, so no-one can just force-push a new git tag to GitHub and own your production builds).

If vendoring plugins were the default way plugins are used, then it'd change everything. But I don't think that's the vision for plugins really, just an option for some.

Jun 17 '19 04:06 toolmantim

How about we start by adding this digest support to the pipeline.yml (as per the PR)? It's optional, and people can use agent hooks to enforce their existence if they want, same as they can for limiting the plugin repo hosts. And we can mention it in the securing guide.

I was already keen to build a "Buildkite Plugin Toolkit" GitHub Action that let's plugin authors keep their readmes up-to-date and SHA'd. At the moment we're using (and recommending) my personal Boomper GitHub app on all our plugins, which is bad. A GitHub Action means we don't have to run our own servers, and we don't have access to their repo. Plugin authors don't want to manually bump version numbers in their readme as it is, so it's pretty easy to do version + digest in there too.

I think these two things gets us 99% of the way to "secure by default" without this being too complicated. And I'm keen to work on the second one anyway, so I can retire my own GitHub App.

Jun 18 '19 00:06 toolmantim

I'm cool with the approach in general (as mentioned on the call).

I'd like to float a different syntax:

steps:
  - command: "go build -o dist/my-app ."
    artifact_paths: "./dist/my-app"
    plugins:
      - docker#v3.2.0+sha.f199bc8:
          image: "golang:1.11"

The things I like about this are:

It's got the semver tick of approval (so potentially the version can be used in other contexts)
It's shorter than the full blown SHA

I'm not sure however of how long a SHA needs to be for it to be "safe"

Jun 21 '19 02:06 keithpitt

@keithpitt 👍🏼

I was all excited and checked out the semver spec:

Build metadata MAY be denoted by appending a plus sign and a series of dot separated identifiers immediately following the patch or pre-release version. Identifiers MUST comprise only ASCII alphanumerics and hyphen [0-9A-Za-z-]. Identifiers MUST NOT be empty. Build metadata SHOULD be ignored when determining version precedence. Thus two versions that differ only in the build metadata, have the same precedence. Examples: 1.0.0-alpha+001, 1.0.0+20130313144700, 1.0.0-beta+exp.sha.5114f85.

Looks like sha.xxx is only an example, and it's mostly about the + sign — and we can do anything after the + sign?

But I just realised, that trying to make it part of the "version" has the problem that it clashes with the plugin's actual version. i.e. If someone was to do a release of their plugin with the tag v3.2.0+something that would be totally valid semver, and would work in our plugin directory as is now (because we use semver), but wouldn't work with this syntax:

steps:
  - command: "go build -o dist/my-app ."
    artifact_paths: "./dist/my-app"
    plugins:
      - docker#v3.2.0+something+sha.f199bc8:
          image: "golang:1.11"

I think having our own clear delimiter between "this is the version part" and "this is the digest part" might be best?

Jun 21 '19 05:06 toolmantim

Personally I really liked @toolmantim's use of the @ like docker uses. I like at least using sha1 as it allows us to move to other hash formats over time.

I view it as independent of the version number FWIW. It's an extra annotation on there.

Jun 21 '19 06:06 lox

IMO this would be just as legit (excluding the branch/tag/version) in favor of the content addressable version.

docker@git-sha1:f199bc812a7d7c770139fe8f67203010e103d0f8

Jun 21 '19 06:06 lox

Some context on how Docker does it: https://engineering.remind.com/docker-image-digests/

I like docker@sha1:f199bc812a7d7c770139fe8f67203010e103d0f8

Jun 21 '19 06:06 lox

I think having our own clear delimiter between "this is the version part" and "this is the digest part" might be best?

I like docker@sha1:f199bc812a7d7c770139fe8f67203010e103d0f8

:+1:

To me it's important to note that while the LHS is version-ish, it's technically a git refname, so we want to avoid a separator character that's likely to be used for that purpose. (To that end, I'd slightly favour a second # over @, because the latter has existing meaning, but both feel more like punctuation than something like +.)

Jun 21 '19 06:06 matthewd

agent agent copied to clipboard

Support plugin SHA-1 digests

agent
agent copied to clipboard