cosign Document Key/Account Compromise Recovery

We should document how rekor can be used to detect and recover from a compromised account or key.

Sep 15 '21 13:09 dlorenc

This issue is stale because it has been open 60 days with no activity. Remove stale label or comment or this will be closed in 5 days.

Sep 13 '22 02:09 github-actions[bot]

Obviously the project can prioritize this against goals and other work, but it does seem like the lack of a revocation story should be highlighted as a risk for adopters trying to build a robust security strategy around cosign.

Sep 13 '22 03:09 mmdriley

I view key compromise as largely external to the Sigstore project. Sigstore aims to provide a secure mapping from signatures to identities ("artifact X was signed by [email protected] at time T"). Revocation seems to be more about whether (and when) you should trust a particular key/identity.

Any use of Sigstore needs to answer questions of that form ("whom should I trust?") for itself. The (currently draft) blog post Signatus, ergo Securus? (you may need to join [email protected] for access) describes potential policies of that form in a few contexts (package repositories, internal to an organization). But that's the level at which to handle revocation. So if you use TUF (as recommended in that post), you should use TUF's support for revocation. If you hard-code a key/identity for verification, you need to support revocation at the point of hard-coding (Don't Panic walks through an example of what that might look like). These all follow the broad philosophy of "revoke artifacts, not keys".

Now, just because this is external to Sigstore doesn't mean that the Sigstore project should stay silent here, so I'm not going to recommend closing this issue quite yet. But I think this is going to fall under the bucket of "advice for adding Sigstore to your setting" and not "documentation about Sigstore itself."

Nov 14 '22 15:11 znewman01

From my reading at least, Sigstore already has a strong, opinionated stance on how trust should work: infrastructure should trust code signed by any key that an intermediary (rekor) has attested is linked to a specific OIDC identity. But there is no provision at any of those layers for what a client should do when they lose control of that identity for a period of time.

I find it hard to accept the idea that organizations should have to invent that story from first principles. Sigstore is a code signing tool, and code signing only prevents attacks when signature verification fails. The ecosystem will be shaped by the attacks anticipated by Sigstore's threat model and the facilities put in place ahead of time that make recovery possible after a security incident.

A great example is the Sigstore policy-controller admission controller for k8s. One well-documented way for configuring that controller is to trust anything signed by a particular OIDC-attested identity. So... if that gets popped, do I change everything in prod to be signed by [email protected] ?

Dec 23 '22 09:12 mmdriley

I appreciate you pushing us on this matter, as this is a really important point 🙂 I think that we're mostly in agreement here.

> From my reading at least, Sigstore already has a strong, opinionated stance on how trust should work: infrastructure should trust code signed by any key that an intermediary (rekor) has attested is linked to a specific OIDC identity.

Kind of. I would phrase this as "should trust code satisfying some policy based on information that an intermediary (rekor) has attested to", with "signed by this particular key" as one example of such a policy.

> I find it hard to accept the idea that organizations should have to invent that story from first principles.

Totally agreed; that's why we're leaving this issue open.

> The ecosystem will be shaped by the attacks anticipated by Sigstore's threat model and the facilities put in place ahead of time that make recovery possible after a security incident.

Also agreed. I think where the confusion comes from is that I was a bit sloppy above: when I say "external to Sigstore" I didn't mean that "the Sigstore project shouldn't provide tooling and a good story for users here." Rather, I meant that we shouldn't use Rekor as canonical source of information about revocation for a couple of reasons:

- Transparency logs are really good at proving something is in the log; they're not good at proving something isn't in the log. This is a big part of why revocation transparency, despite a number of proposals (some quite complex), hasn't rolled out in web PKI.
- Even if we assume that Rekor is trusted to answer queries honestly, using it as a CRL imposes a huge load: it means that every check needs to be online for verifiers to ask "has this been revoked yet?" or we need to invent some form of stapling (which requires frequent refreshing, bad for distributing artifacts over CDNs).
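To make the first point concrete, here is a minimal sketch of inclusion-proof verification, assuming an RFC 6962-style Merkle tree (the general shape of tree that transparency logs use). The helper names are illustrative and this is not Rekor's client code. The proof is only O(log n) hashes long, which is why "this entry is in the log" is cheap to check. An append-only log has no comparably compact way to show that an entry is absent.

```python
import hashlib


def leaf_hash(data: bytes) -> bytes:
    # RFC 6962 domain separation: 0x00 prefix for leaves.
    return hashlib.sha256(b"\x00" + data).digest()


def node_hash(left: bytes, right: bytes) -> bytes:
    # RFC 6962 domain separation: 0x01 prefix for interior nodes.
    return hashlib.sha256(b"\x01" + left + right).digest()


def _split(n: int) -> int:
    # Largest power of two strictly less than n (n >= 2).
    return 1 << ((n - 1).bit_length() - 1)


def merkle_root(leaves: list[bytes]) -> bytes:
    if len(leaves) == 1:
        return leaf_hash(leaves[0])
    k = _split(len(leaves))
    return node_hash(merkle_root(leaves[:k]), merkle_root(leaves[k:]))


def inclusion_proof(leaves: list[bytes], index: int) -> list[bytes]:
    # Log side: sibling hashes from the leaf up to the root.
    if len(leaves) == 1:
        return []
    k = _split(len(leaves))
    if index < k:
        return inclusion_proof(leaves[:k], index) + [merkle_root(leaves[k:])]
    return inclusion_proof(leaves[k:], index - k) + [merkle_root(leaves[:k])]


def verify_inclusion(data: bytes, index: int, tree_size: int,
                     proof: list[bytes], root: bytes) -> bool:
    # Verifier side: recompute the root from the leaf and the proof.
    if index >= tree_size:
        return False
    fn, sn = index, tree_size - 1
    r = leaf_hash(data)
    for p in proof:
        if sn == 0:
            return False
        if fn & 1 or fn == sn:
            r = node_hash(p, r)
            if not fn & 1:
                # Skip levels where this node has no right sibling.
                while not fn & 1 and fn != 0:
                    fn >>= 1
                    sn >>= 1
        else:
            r = node_hash(r, p)
        fn >>= 1
        sn >>= 1
    return sn == 0 and r == root
```

A verifier holding only a trusted root, the entry, its index, and the proof can run `verify_inclusion` offline. Answering "has this been revoked?" would instead need either the whole log or a fresh online query.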

The points above share a common theme: revocation is really hard in web PKI because it's expected to be done by the PKI itself, and revocations from everybody need to go to the same place, causing issues with scale. In the code signing setting, we have a huge challenge: the question you're trying to answer in web PKI is "does this cert match the string in my URL bar?"; in code signing, you're trying to answer "is this artifact 'good'?" The answer to that will be quite application-specific and require an application-specific policy.

That's also an advantage for revocation, as it's much more scalable to handle revocation per-distribution platform. For instance: a package repository like PyPI or npm could handle revocation in their package managers. This means that techniques like CRLs, which would fail at the scale required to handle every revocation in the same place, should actually work! Another setting might find stapling to be acceptable; still another might be fine with manually blocklisting bad signatures (redeploying the policy post-compromise). I don't think there's a one-size-fits-all answer.

I don't mean to suggest that each of these applications should invent their own solution here. But the Sigstore project should be in the business of providing tools to help craft verification policies that handle revocation, not necessarily building it into the infrastructure itself. Normally, I'm skeptical of security tools that leave users to figure things out for themselves—but I think this is reasonable because each user needs a verification policy anyway.

> One well-documented way for configuring that controller is to trust anything signed by a particular OIDC-attested identity. So... if that gets popped, do I change everything in prod to be signed by [email protected] ?

My suggestion would be: instead of trusting "[email protected]", you trust "[email protected] except from time T to time T'." Or, perhaps, "[email protected], except blocklisted artifacts X, Y, and Z" (and you can get that list with high confidence because of the transparency log). You can do this (in theory, not all implemented yet) either by modifying the policy-controller configuration directly and redeploying, or by having policy controller hit some external service to fetch the blocklists.
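As a rough illustration of that suggestion, the sketch below evaluates "trust this identity, except during a compromise window, and except blocklisted artifacts" at verification time. Everything here is hypothetical: `SignatureRecord`, `IdentityPolicy`, and the `example.test` identity are made up for the example and are not part of Cosign or policy-controller. It also assumes `signed_at` comes from a source the verifier trusts (for example, the log's entry time) rather than from the signer.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass(frozen=True)
class SignatureRecord:
    """What the verifier learned about one signature (hypothetical shape)."""
    identity: str          # OIDC identity bound to the signing certificate
    artifact_digest: str   # digest of the signed artifact
    signed_at: datetime    # ASSUMPTION: taken from a trusted timestamp


@dataclass
class IdentityPolicy:
    """Trust one identity, minus compromise windows and blocklisted artifacts."""
    trusted_identity: str
    excluded_windows: list[tuple[datetime, datetime]] = field(default_factory=list)
    blocked_digests: set[str] = field(default_factory=set)

    def allows(self, record: SignatureRecord) -> bool:
        if record.identity != self.trusted_identity:
            return False
        if record.artifact_digest in self.blocked_digests:
            return False
        # Reject anything signed while the identity was out of the owner's control.
        return not any(
            start <= record.signed_at <= end
            for start, end in self.excluded_windows
        )


if __name__ == "__main__":
    def day(d: int) -> datetime:
        return datetime(2022, 12, d, tzinfo=timezone.utc)

    policy = IdentityPolicy(
        trusted_identity="releases@example.test",   # hypothetical identity
        excluded_windows=[(day(10), day(12))],       # T to T'
        blocked_digests={"sha256:known-bad"},        # artifacts X, Y, Z
    )
    record = SignatureRecord("releases@example.test", "sha256:abc", day(11))
    print(policy.allows(record))
```

Recovery then amounts to redeploying the policy with a new window or blocklist entry, rather than rotating the identity every verifier trusts.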

I know that was wordy, sorry 😄 Didn't have time to make it shorter. But in summary (all IMO, and I'm just one person):

- I agree totally that revocation is a critical problem, and the Sigstore project should address it.
- However, the infrastructure/PKI layer is the wrong place.
- Rather, Sigstore should make it easy to craft policies that handle revocation at verification time. This should include improvements to tooling (including Cosign, policy-controller, and client libraries).

CC @haydentherapper who may be interested in this discussion as well

Dec 23 '22 15:12 znewman01

Thanks for the thoughtful reply, and for going into a lot more detail on how you're thinking about the responsibilities of different parts of the larger Sigstore project.

I'm really glad we agree that revocation is something the ecosystem needs to have an opinion on and a paved path for. And I'm totally with you that there's a lot we can learn (and hopefully avoid) from the generations of less-than-successful solutions to revocation in the WebPKI.

I guess my biggest concern is -- as the k8s policy-controller example shows -- the premier "V1" end-to-end narrative around Sigstore centers on verifying (only) OIDC identity. I'm worried that moving the ecosystem to a place where revocation is possible is going to require a step-function difference in how much verifiers and signers need to coordinate versus just meeting at something shaped like an email address.

(As a comparison: the move from traditional, expensive, high-friction TLS certificates to the brave new world of Let's Encrypt required inventing ACME for automation and moving the ecosystem to shorter-lived certificates with transparency logging to reduce the blast radius and increase the visibility of improperly issued certificates.)

I'm on the record as a huge fan of thinking about policy in signing infrastructures. But I think there's some urgency in Sigstore finding some equilibrium points in the design space of tradeoffs in usability, security, and performance/scalability/reliability, documenting those, and evangelizing them through the infrastructure so common deployments of Sigstore can recover from common attacks like credential compromise.

Dec 24 '22 07:12 mmdriley

We don't currently have a preferred way to specify who can sign for what. TUF seems to be a good candidate for this: PEP 480 (WIP), for example, discusses this, as does @znewman01 and @mnm678's blog post.

+1 to @znewman01, I don't think we should build revocation support into the infrastructure. There are too many questions with building this at the infrastructure layer: How do we handle indefinite growth of a revocation list? How do we efficiently ship that list to clients? How does an identity owner prove ownership over a compromised identity (webPKI uses the private key to self-sign a statement, but private keys are ephemeral for sigstore, and what if an identity owner lost control of their OIDC account)?

Pushing this to the clients, we need to make it easy to build policy. I think it'd be worth exploring if we can build the policies we want to recommend ("trust identity X to sign for artifact Y", "don't trust identity A from time S to T", etc) using existing policy languages (cue, rego). Any gaps should be filled by sigstore libraries (maybe we extract policy logic out of policy-controller into its own library too).
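For a sense of what the "trust identity X to sign for artifact Y" rule might look like as data plus a small evaluator, here is a sketch in plain Python. It is a stand-in for a real policy language such as cue or rego, not an example of either, and the registry paths, identities, and function names are all hypothetical.

```python
from fnmatch import fnmatchcase

# Hypothetical delegation table: artifact name pattern -> identities allowed to sign.
DELEGATIONS: list[tuple[str, set[str]]] = [
    ("registry.example.test/team-a/*", {"team-a-ci@example.test"}),
    ("registry.example.test/team-b/*", {"team-b-ci@example.test",
                                        "release-bot@example.test"}),
]


def allowed_signers(artifact_name: str,
                    delegations: list[tuple[str, set[str]]] = DELEGATIONS) -> set[str]:
    """Union of identities from every pattern that matches the artifact name."""
    signers: set[str] = set()
    for pattern, identities in delegations:
        if fnmatchcase(artifact_name, pattern):
            signers |= identities
    return signers


def may_sign(identity: str, artifact_name: str,
             delegations: list[tuple[str, set[str]]] = DELEGATIONS) -> bool:
    """Default deny: an artifact with no matching pattern has no trusted signer."""
    return identity in allowed_signers(artifact_name, delegations)
```

Keeping the rules as data means revoking an identity is a one-line change to the table, and the same table could in principle be distributed through something like TUF metadata.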

Jan 03 '23 18:01 haydentherapper

This issue was closed because it has been stalled for 5 days with no activity.

Sep 10 '23 01:09 github-actions[bot]
