ratify
Ratify does not support multiple stores
What happened in your environment?
I created two Store CRs in my cluster. The first has an empty auth provider; the second uses Azure workload identity. Verification and mutation often report 401 when verifying an image from a private registry. The workload identity permissions and the store should be set up correctly, since the mutation sometimes succeeds.
From the log below, you can see that the mutation works for xinhl003acr.azurecr.io/net-monitor: gatekeeper.log
My repro steps:
- create AKS cluster with OIDC issuer and workload identity enabled
- install ratify via helmfile
- patch the ratify deployment with
azure.workload.identity/use: "true"
- create one more store with auth provider azureWorkloadIdentity
What did you expect to happen?
No response
What version of Kubernetes are you running?
1.26
What version of Ratify are you running?
v1.0.0-rc.6
Anything else you would like to add?
No response
Are you willing to submit PRs to contribute to this bug fix?
- [ ] Yes, I am willing to implement it.
Thanks for reporting this issue @fseldow. Here's my initial analysis on the issue:
It seems that Ratify does not handle errors properly when multiple stores are configured. Currently, Ratify dispatches one verification goroutine per configured referrer store. However, if any store API call returns an error during a store's routine, ALL routines are terminated. https://github.com/deislabs/ratify/blob/de1468300c6fed28fc07aa76becae20937789d6f/pkg/executor/core/executor.go#L154
In your scenario, I believe that the docker config default store's routine fails early, causing the workload identity configured store's routine to terminate as well.
Potential fix: do not throw on first error from any store routine. We should catch that.
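To make the proposed fix concrete, here is a minimal Go sketch of collecting per-store results instead of failing on the first error. It is not Ratify's executor code: `referrerStore`, `storeResult`, `listFromAllStores`, and the two helper constructors are hypothetical names invented for this illustration. The overall call fails only if every store fails, and each result keeps its store name, which also speaks to the question of which store produced a given report.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

// referrerStore is a hypothetical stand-in for a configured store.
// It is NOT Ratify's ReferrerStore interface.
type referrerStore struct {
	name string
	list func(subject string) ([]string, error)
}

// storeResult records what one store returned, so the caller can tell
// which store produced which referrers (or which error).
type storeResult struct {
	store     string
	referrers []string
	err       error
}

// failingStore and staticStore are illustration-only constructors.
func failingStore(name, msg string) referrerStore {
	return referrerStore{name: name, list: func(string) ([]string, error) {
		return nil, errors.New(msg)
	}}
}

func staticStore(name string, refs ...string) referrerStore {
	return referrerStore{name: name, list: func(string) ([]string, error) {
		return refs, nil
	}}
}

// listFromAllStores queries every store concurrently and waits for all of
// them. An error from one store does not cancel the others; the call as a
// whole fails only when no store succeeds.
func listFromAllStores(subject string, stores []referrerStore) ([]storeResult, error) {
	if len(stores) == 0 {
		return nil, errors.New("no stores configured")
	}
	results := make([]storeResult, len(stores))
	var wg sync.WaitGroup
	for i, s := range stores {
		wg.Add(1)
		go func(i int, s referrerStore) {
			defer wg.Done()
			refs, err := s.list(subject)
			// Each goroutine writes only its own index, so no lock is needed.
			results[i] = storeResult{store: s.name, referrers: refs, err: err}
		}(i, s)
	}
	wg.Wait()

	var errs []error
	for _, r := range results {
		if r.err != nil {
			errs = append(errs, fmt.Errorf("store %q: %w", r.store, r.err))
		}
	}
	if len(errs) == len(results) {
		// Every store failed: surface all errors together (Go 1.20+).
		return results, errors.Join(errs...)
	}
	return results, nil
}

func main() {
	stores := []referrerStore{
		failingStore("default", "401 unauthorized"),
		staticStore("workload-identity", "sha256:aaa"),
	}
	results, err := listFromAllStores("example.azurecr.io/net-monitor:v1", stores)
	fmt.Println("overall error:", err)
	for _, r := range results {
		fmt.Printf("store=%s referrers=%v err=%v\n", r.store, r.referrers, r.err)
	}
}
```

Whether a partial failure should be logged, attached to the verification report, or escalated is left open here, matching the open questions below.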
Open Questions:
- When and how do we bubble up the failure from a particular store?
- How do we know which store was eventually used for the verification report generated?
cc: @binbin-li
Side note about auth providers: Currently, the only way to have multiple auth providers is to create multiple stores, as @fseldow has done here. IMO this is not a great experience. Each referrer store should instead represent a distinct artifact store, rather than the same artifact store repeated with different auth schemes as we require now.
I propose we rework our auth providers so that a keychain of auth providers can be supplied. We should also add prefix matching support so we can select a configured auth provider based on the image reference (e.g. use anon for ghcr + MAR + AWS public gallery; use Azure WI for ACR and ECR IRSA for ECR private registries).
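As a rough illustration of the prefix matching idea, the Go sketch below picks an auth provider by the longest matching prefix of the image reference. Nothing here exists in Ratify today: `authRule`, `selectProvider`, the registry prefixes, and the provider labels other than `azureWorkloadIdentity` are assumptions made up for the example.

```go
package main

import (
	"fmt"
	"strings"
)

// authRule maps an image reference prefix to an auth provider name.
// Both the type and the provider labels are hypothetical.
type authRule struct {
	prefix   string // registry or repository prefix, ending in "/" to avoid partial host matches
	provider string
}

// selectProvider returns the provider of the longest rule prefix that
// matches imageRef, or fallback when no rule matches.
func selectProvider(imageRef string, rules []authRule, fallback string) string {
	best, bestLen := fallback, -1
	for _, r := range rules {
		if strings.HasPrefix(imageRef, r.prefix) && len(r.prefix) > bestLen {
			best, bestLen = r.provider, len(r.prefix)
		}
	}
	return best
}

func main() {
	// Example keychain; the registry hosts are placeholders.
	rules := []authRule{
		{prefix: "ghcr.io/", provider: "anonymous"},
		{prefix: "mcr.microsoft.com/", provider: "anonymous"},
		{prefix: "myregistry.azurecr.io/", provider: "azureWorkloadIdentity"},
		{prefix: "myregistry.azurecr.io/public/", provider: "anonymous"},
	}
	for _, ref := range []string{
		"ghcr.io/deislabs/ratify:v1",
		"myregistry.azurecr.io/net-monitor:v1",
		"myregistry.azurecr.io/public/demo:v1",
		"docker.io/library/nginx:latest",
	} {
		fmt.Printf("%s -> %s\n", ref, selectProvider(ref, rules, "dockerConfig"))
	}
}
```

Longest-prefix selection lets a repository-level rule override a registry-level one. Ending each prefix with `/` keeps a rule for `ghcr.io/` from matching a host such as `ghcr.io.example.com`.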
Thanks for reporting this issue, @fseldow. Unfortunately, Ratify is currently designed to work with a single store; scenarios with multiple stores are untested. With the current architecture, allowing multiple stores poses the following challenges:
- If the configured stores are intended for different registries, each probably has a different auth configured. We would be sending registry credentials to the wrong store, which poses a security risk.
- If stores return the same artifact, Ratify currently has no dedup logic, which can cause further delays in verification (we currently have a 5 second time limit).
- We don't have logging support to diagnose which artifact was verified from which store. Allowing more stores introduces ambiguity.
We have issue 4 open to introduce label matching for auth config, probably post 1.0. We will also wait for more user requirements to help plan support for multiple stores in the future.
Given we are heading to GA soon, we are likely to ship with the current design. @fseldow, let's discuss further how to unblock you given the current limitation. //cc @akashsinghal
@fseldow we plan to add multiple store support post v1, please let us know if it's a blocker for you.