docs: Proposal: Source Verification Policies
This is a 2.0 for an existing proposal proposed by @jannfis a while ago: https://github.com/argoproj/argo-cd/pull/14964
I would like to receive some initial thumbs up/down on the general direction and eventual mergability for the RFE. The details might evolve as I will work on the PoC implementation...
TL;DR
Abstract current GIT+GPG verification by something more powerful:
- In
AppProject, replace list.spec.signatureKeys([]string), by a struct at.spec.sourceIntegrity. Struct permits adding future fields to support other sources than git, and other methods of verification that GPG. - For GPG
- Specify repository-specific policies permitting fine-grained control of blessed key sets.
- Introduce option to verify the git history is signed (not just current commit/tag).
- The keychain management remains unchanged
- Implementation wise, this moves the source integrity criteria evaluation to repo-server. Manifests are not created (nor sent back) for sources its integrity was not verified successfully.
Example
apiVersion: argoproj.io/v1alpha1
kind: AppProject
spec:
sourceIntegrity:
git:
policies:
- repos:
- "https://github.com/foo/*"
gpg:
mode: "none|head|strict"
keys:
- "0xDEAD"
- "0xBEEF"
Changes from the original proposal by @jannfis
-
progressivestrategy is removed as suggested in the original proposal, as it was responsible for large part of the risks and corner cases.- An alternative is proposed as an optional extension, because the functionality is generally quite desirable.
- The yaml config format for policy have changed, providing greater flexibility for future needs.
Checklist:
- [x] Either (a) I've created an enhancement proposal and discussed it with the community, (b) this is a bug fix, or (c) this does not need to be in the release notes.
- [x] The title of the PR states what changed and the related issues number (used for the release note).
- [x] The title of the PR conforms to the Title of the PR
- [n/a] I've included "Closes [ISSUE #]" or "Fixes [ISSUE #]" in the description to automatically close the associated issue.
- [n/a] I've updated both the CLI and UI to expose my feature, or I plan to submit a second PR with them.
- [x] Does this PR require documentation updates?
- [x] I've updated documentation as required by this PR.
- [x] I have signed off all my commits as required by DCO
- [n/a] I have written unit and/or e2e tests for my change. PRs without these are unlikely to be merged.
- [ ] My build is green (troubleshooting builds).
- [ ] My new feature complies with the feature status guidelines.
- [ ] I have added a brief description of why this PR is necessary and/or what this PR solves.
- [ ] Optional. My organization is added to USERS.md.
- [ ] Optional. For bug fixes, I've indicated what older releases this fix should be cherry-picked into (this may or may not happen depending on risk/complexity).
:x: Preview Environment undeployed from Bunnyshell
Available commands (reply to this comment):
- :rocket:
/bns:deployto deploy the environment
I like the fact that the source verification is extensible - I've been thinking about the best way to incorporate source checking for OCI repositories, and this seems extensible enough to allow for that in the future.
From a practical standpoint, how and where is this meant to be implemented? It seems the place for this would be the repo-server, where we would pass along the required attributes. Are these checks supposed to happen as a pre-cursor before running GenerateManifests?
I like the fact that the source verification is extensible - I've been thinking about the best way to incorporate source checking for OCI repositories, and this seems extensible enough to allow for that in the future.
Do you have more details about this use case? Not to incorporate it, I just want to make sure the API design is versatile enough to support that in the future.
From a practical standpoint, how and where is this meant to be implemented? It seems the place for this would be the repo-server, where we would pass along the required attributes.
As I play with the implementation right now - yes, it sends the criteria to repo-server and validation results back for server to let the sync fail depending on the result.
Future strategies, can do the validation there as well, but they are not required too.
Are these checks supposed to happen as a pre-cursor before running
GenerateManifests?
Not sure yet. I intend to verify where gpg used to.
Though, and this only maybe answers your question, if we send the entire criteria to repo-server, we can make the decision about (not) generating manifests and sending them back there. Although, it is going to save time and bandwidth only in error cases, so it will likely not be a huge win.
Do you have more details about this use case? Not to incorporate it, I just want to make sure the API design is versatile enough to support that in the future.
I don't have that much detail at this time; this isn't really something I've given too much thought. The gist of it is that when we pull an OCI Image it would be good to verify the image layers by its digest. There are a few ways to do this, the most likely one I'd probably start with would be to use Sigstore to verify the digest of the downloaded OCI artifact image. An example can be found here (the example has a lot of knobs and can be simplified for our use case)
From the perspective of your proposal I'd probably extend it in the style of
apiVersion: argoproj.io/v1alpha1
kind: AppProject
spec:
sourceIntegrity:
oci:
policies:
- repos:
- "oci://my-docker-registry/foo/*"
sigstore:
# TODO: Add a ton of sigstore attributes
@blakepettersson: Yeah, that definitely sounds implementable...
-
Is there likely to be any performance issues validating all the commits are signed for repos that may have thousands or more commits?
-
Does every commit need to be signed or just the ones that impact the path in the destination? I'm assuming every commit in the repo would need to be signed which is fine but just thinking of folks with mono-repos?
-
Would it make sense to have a mode called
anywhich would mean all commits need to be signed by someone but we don't need to validate the keys? I'm thinking of open-source projects with hundreds of contributors, for example at Red Hat we have the gitops-catalog (https://github.com/redhat-cop/gitops-catalog) which is currently unsigned but if we were to start signing it folks might want to validate a signature is present but not have to deal with the minutae of who signed it specifically. Similarly for referencing helm charts in git repos that are coming from vendors.
- Is there likely to be any performance issues validating all the commits are signed for repos that may have thousands or more commits?
There will be some penalty. I have tested it with this repository of 10K commits, and it took about 20 seconds to verify them all (on SSD). I was quite impressed by git scalability to be honest, but I see this can be annoying. Note it only affects the most secure of the modes, where we can expect people to be ready to pay the price in exchange for the increased security. AND, they can very likely do a performance optimization by creating a seal commit (but I have not verified that it will help, and how much).
- Does every commit need to be signed or just the ones that impact the path in the destination? I'm assuming every commit in the repo would need to be signed which is fine but just thinking of folks with mono-repos?
Every commit. We can think of relaxing this in the future, but I am concerned about cross-dir references, so sources from different path can influence the resulting manifests, without being checked for integrity.
- Would it make sense to have a mode called
anywhich would mean all commits need to be signed by someone but we don't need to validate the keys? I'm thinking of open-source projects with hundreds of contributors, for example at Red Hat we have the gitops-catalog (https://github.com/redhat-cop/gitops-catalog) which is currently unsigned but if we were to start signing it folks might want to validate a signature is present but not have to deal with the minutae of who signed it specifically. Similarly for referencing helm charts in git repos that are coming from vendors.
It would be more secure than nothing, but trivially avoidable for motivated attacker 🤷. The great news this can be easily added after this refactor.
@jannfis, @pasha-codefresh, @blakepettersson. thanks for helping me to drive this proposal thus far! There are a number of open questions that I would like to get opinions on.
- Policy matching: currently policy can be specified for repo URL glob pattern only. Is there a use-case for using some other criteria? I am thinking of using branch-specific policies. Even when not implemented right away, it can be useful to change the API for enable those in the future. From:
sourceIntegrity:
git:
policies:
- repos:
- "https://github.com/foo/*"
# other policy attributes
to
sourceIntegrity:
git:
policies:
- for:
repo: "https://github.com/foo/*"
branch: "master" # This may be implemented when the need arise, but not with the current API design.
# other policy attributes
- I noticed that resources are generated (and the repository is accessed) before it Argo CD verifies the GPG signatures, or even when it fails. It feels wrong to me as it can cause damage in the repo-server container (plugin source type that executes script from the repo). Is this something we want to reverse, first verify integrity then build manifests? With this implemented, the source integrity "verdict" is available in the repo-server so it will not require any additional gRPC calls, but it would likely require changing in the response, that returns manifests even with source integrity not validated.