No way to bind attestations in attestation bundle together

Open colek42 opened this issue 3 years ago • 16 comments

In Witness, we use an attestation_collection. Our current implementation nests all of the attestations into a single object, and signs them together. This allows us to bind attestations together if they are part of the same process. For example, if we have an SLSA and a runtime trace attestation, the user should have a way to verify that they happened as a part of the same invocation.

One way forward would be to create an "attestation_collection" type that contains the hashes of the attestations created during the invocation.

The corresponding issue in Witness is here: https://github.com/testifysec/witness/issues/240

colek42 avatar Feb 10 '23 16:02 colek42

To clarify, we have three layers: envelope (signing), statement (subject), and predicate. At what layer would the bundling be desirable in your use case?

  • Multi-predicate (same subject, same signer)
  • Multi-statement (different subjects, same signer)
  • Multi-envelope (different subjects, different signers)

MarkLodato avatar Feb 10 '23 18:02 MarkLodato

We would want to calculate the secure hash over the statements. We do not want to be bound to a specific envelope type.

Also, the collection should expose all subjects for its referencing statements.
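A rough sketch of what that could look like, assuming the hash is taken over a deterministic JSON encoding of each statement (so it does not depend on any envelope format). The `statements` field name and the encoding choice are assumptions for illustration, not anything defined by the spec or by Witness:

```python
import hashlib
import json


def statement_digest(statement: dict) -> str:
    """SHA-256 over a deterministic JSON encoding of the statement.

    Assumption: sorted keys and compact separators stand in for whatever
    canonical encoding a real implementation would pick.
    """
    encoded = json.dumps(statement, sort_keys=True, separators=(",", ":")).encode("utf-8")
    return hashlib.sha256(encoded).hexdigest()


def build_collection(statements: list) -> dict:
    """Hypothetical collection: one digest per statement plus the union of subjects."""
    subjects = []
    seen = set()
    for st in statements:
        for subj in st.get("subject", []):
            key = json.dumps(subj, sort_keys=True)
            if key not in seen:  # expose each distinct subject once
                seen.add(key)
                subjects.append(subj)
    return {
        "subject": subjects,
        "statements": [
            {
                "predicateType": st["predicateType"],
                "digest": {"sha256": statement_digest(st)},
            }
            for st in statements
        ],
    }
```

Because the digest covers the statement rather than the envelope, the same collection would stay valid regardless of how each statement is later wrapped and signed.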

@mikhailswift probably has some thoughts as well

colek42 avatar Feb 10 '23 21:02 colek42

Sorry, I'm not sure I understand the use case. Could you expand on your example, preferably giving example data? Like, do you want to just refer to one attestation from another, or sign multiple statements as a unit (in a single envelope), or create a new (signed) attestation that "bundles" multiple existing (signed) attestations?

MarkLodato avatar Feb 10 '23 22:02 MarkLodato

@MarkLodato go to https://judge.testifysec.io/ and type in https://github.com/testifysec/go-witness. Each of the results is an attestation collection. Those attestations happened during the same invocation of Witness.

We need this functionality but want to bring Witness in line with the spec.

colek42 avatar Feb 10 '23 22:02 colek42

It sounds like what you want is a predicate bundle, i.e. multiple predicates that all apply to the same subject. Below is an example I downloaded from Witness. I can't find documentation for https://witness.testifysec.com/attestation-collection/v0.1 but I believe the terminology is incorrect - all uses of "attestation" ought to be "predicate". (See https://slsa.dev/attestation-model#model-and-terminology.)

I'll leave it open to the maintainers about whether and how to proceed.

A few open questions if this is accepted:

  • Should we deprecate/remove the top-level predicateType+predicate, or should we support both?
  • Is there a need for multiple statements, e.g. [{"subject": [...], "predicates": [...]}, {"subject": [...], "predicates": [...]}, ...]? If so, should we design that support at the same time as this?

Example

The following witness bundle:

{
  "_type": "https://in-toto.io/Statement/v0.1",
  "subject": [{...A...}, {...B...}, {...C...}],
  "predicateType": "https://witness.testifysec.com/attestation-collection/v0.1",
  "predicate": {
    "name": "static-analysis",
    "attestations": [
      {
        "type": "https://witness.dev/attestations/github/v0.1",
        "attestation": {
          "jwt": {...},
          "ciconfigpath": "",
          "pipelineid": "3300222393",
          ...
        }
      },
      {
        "type": "https://witness.dev/attestations/material/v0.1",
        "attestation": {
          ".git/FETCH_HEAD": {
            "sha256": "4b6e7d5ad4b2a9fdc96032fb64d7beefde19b1d57c4a265ad798d43b4709ffac"
          },
          ...
        }
      },
      ...
    ]
  }
} 

would look something like this if standardized:

{
  "_type": "https://in-toto.io/Statement/v1.0-draft",  // presumably some future version?
  "subject": [{...A...}, {...B...}, {...C...}],
  "predicates": [
    {
      "predicateType": "https://witness.dev/attestations/github/v0.1",
      "predicate": {
        "jwt": {...},
        "ciconfigpath": "",
        "pipelineid": "3300222393",
        ...
      }
    },
    {
      "predicateType": "https://witness.dev/attestations/material/v0.1",
      "predicate": {
        ".git/FETCH_HEAD": {
          "sha256": "4b6e7d5ad4b2a9fdc96032fb64d7beefde19b1d57c4a265ad798d43b4709ffac"
        },
        ...
      }
    },
    ...
  ]
} 
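The mapping between the two shapes is mechanical. A minimal sketch, assuming the field names from the two examples above; the `predicates` list and the `v1.0-draft` type string are the hypothetical standardized form, not an existing spec. Note that the collection's `name` field has no counterpart in the second example and is dropped here:

```python
def to_multi_predicate(statement: dict) -> dict:
    """Convert a Witness-style attestation collection into the
    hypothetical multi-predicate Statement shape."""
    entries = statement["predicate"]["attestations"]
    return {
        # Placeholder type string taken from the example, not a published version.
        "_type": "https://in-toto.io/Statement/v1.0-draft",
        "subject": statement["subject"],
        "predicates": [
            {"predicateType": entry["type"], "predicate": entry["attestation"]}
            for entry in entries
        ],
    }
```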

MarkLodato avatar Feb 13 '23 13:02 MarkLodato

I don't believe multiple statements would be desirable for Witness' use case; I think multiple predicates is more fitting.

The model we've taken with Witness for our 'attestations' (I'll refer to them as predicates in this post from here on) is that each predicate can expose some subjects, which we prepend the predicateType onto to avoid collisions within the subjects. Ultimately all of these predicates are working together to make a singular statement about some artifact, though, so I think the multiple predicates within one Statement model fits better here.
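To illustrate the prefixing idea, here is a small sketch. The `subjects` field on each predicate entry and the `/` separator are assumptions made for the example; they are not taken from Witness' actual data model:

```python
def collect_subjects(predicates: list) -> list:
    """Merge the subjects exposed by each predicate into one list,
    prefixing every subject name with its predicateType so that two
    predicates exposing the same name do not collide."""
    merged = []
    for pred in predicates:
        for subj in pred.get("subjects", []):
            merged.append(
                {
                    # Separator is a hypothetical choice for illustration.
                    "name": f'{pred["predicateType"]}/{subj["name"]}',
                    "digest": subj["digest"],
                }
            )
    return merged
```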

Our goal with the many-predicates-in-one-statement concept was to allow these predicates to be small and specialized for their use cases if necessary. For example, a unit test predicate may expose some unit-test-specific information, but we still may want the GitLab predicate in our statement to show that the unit test information originated from a specific GitLab runner for a job running for a specific project.

mikhailswift avatar Feb 13 '23 14:02 mikhailswift

More potential use cases for SLSA:

  • https://github.com/slsa-framework/slsa/issues/620: Perhaps we want to suggest creating multiple provenance predicates representing different levels of abstraction?
  • @lehors brought up the idea of dropping resolvedDependencies in favor of using a standard SBOM format (CycloneDX or SPDX), which would similarly want to bind the two predicates together.

MarkLodato avatar Feb 16 '23 14:02 MarkLodato

We discussed inclusion of https://github.com/in-toto/attestation/issues/136 at today's maintainers meeting. While we understand the use case, combining a lot of predicates into one statement can be an anti-pattern: it prevents users from filtering out the predicate types they're not interested in if they only want to forward some of them (e.g. they're calling an API that has concerns about being sent huge amounts of data).

So we've decided it's too risky to include in v1, but are happy to discuss post-v1.

One idea is to have a predicate type that is a list of references to existing DSSEs. That way the individual statements could be filtered if needed, but folks still have a way of knowing all the evidence that is bundled together.

TomHennen avatar Feb 17 '23 17:02 TomHennen

Oh note that it may be entirely possible to handle this just by defining a new predicateType, which won't block on versioning.

TomHennen avatar Feb 17 '23 18:02 TomHennen

Was thinking about this problem today. Could the SCAI predicate be used to address this? The predicate needs to be updated to use the v1.0 spec, but I'm thinking it could be used to implement an attestation collection (example below). Thoughts?

Pros: Existing predicate type designed to capture a collection of artifact attributes and evidence

Limitations:

  • How to set the required SCAI attribute field is an open question
  • The content field of the ResourceDescriptor should only be used for artifacts that are less than 1KB; not sure if Witness attestations would regularly exceed that or not.

{
    // Standard attestation fields
    "_type": "https://in-toto.io/Statement/v1",
    "subject": [{ ... A ... }],
        
    "predicateType": "https://in-toto.io/scai/attribute-report/v0.2?draft",
    "predicate": {
        "attributes": [{
            "attribute": "https://witness.dev/attestations/github/v0.1",
            "evidence": {               // this will be a ResourceDescriptor in v0.2
                "digest": { "sha256": "abcdabcde..." },
                "content": "<Base64(serialized attestation)>"
            }
        },
        {
            "attribute": "https://witness.dev/attestations/material/v0.1",
            "evidence":  { 
                "digest": { "sha256": "01234567..." },
                "content": "<Base64(serialized attestation)>"
            }
        }],
       "producer": { "type": "https://witness.dev" }
    }
}
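As a sketch of how the size limitation could be handled, the helper below always records the digest and only inlines the content when the serialized attestation is under the limit. The function name and the reading of "less than 1KB" as 1024 bytes are assumptions for illustration:

```python
import base64
import hashlib

# Assumption: "less than 1KB" is interpreted as fewer than 1024 bytes.
MAX_INLINE_BYTES = 1024


def evidence_descriptor(serialized: bytes) -> dict:
    """Build an evidence entry with a digest, inlining content only when small."""
    descriptor = {"digest": {"sha256": hashlib.sha256(serialized).hexdigest()}}
    if len(serialized) < MAX_INLINE_BYTES:
        descriptor["content"] = base64.b64encode(serialized).decode("ascii")
    return descriptor
```

With this shape, a large attestation is still bound to the collection by its digest, and a consumer would have to obtain the bytes some other way.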

marcelamelara avatar Mar 22 '23 00:03 marcelamelara

Another con is that resource descriptor suggests content be < 1K and I suspect this would wind up being quite large.

I think another way to go that could serve everyone's needs (allow predicates to be separable while maintaining integrity and being able to attest to them all at once) would be to have a meta collection type.

E.g.

<dsse 1>
<dsse 2>
<dsse 3>

Where DSSE 1 and 2 are https://witness.dev/attestations/github/v0.1 and https://witness.dev/attestations/material/v0.1 and DSSE 3 is a new predicate type .../evidenceCollection.

{
    "_type": "https://in-toto.io/Statement/v1",
    "subject": [{ ... A ... }],
        
    "predicateType": "https://in-toto.io/evidenceCollection/v1",
    "predicate": {
        // A list of ResourceDescriptors
        "evidence": [{
            // It seems like the predicate type would be useful to have, putting it in name which might make sense
            // Could also be put somewhere else.
            "name": "https://witness.dev/attestations/github/v0.1",
            "digest": {"sha256": sha256(<dsse 1>)}
          },{
            "name": "https://witness.dev/attestations/material/v0.1",
            "digest": {"sha256": sha256(<dsse 2>)}
          }
         ],
       "producer": { "type": "https://witness.dev" }
    }
}

This way you can separate all three DSSEs if you want, but if you want to verify them all together you can grab the evidenceCollection, check that you have each of those DSSEs and verify that collection at once.
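The "check that you have each of those DSSEs" step could look roughly like this. It is a sketch under the assumption that the digest is a SHA-256 over the raw DSSE bytes and that the predicate uses the `evidence`, `name`, and `digest` fields from the example; signature verification of the envelopes themselves is out of scope here:

```python
import hashlib


def missing_evidence(collection_predicate: dict, dsse_blobs: list) -> list:
    """Return the names of evidence entries whose digest matches none of
    the DSSE envelopes (raw bytes) that the verifier has on hand."""
    available = {hashlib.sha256(blob).hexdigest() for blob in dsse_blobs}
    return [
        entry["name"]
        for entry in collection_predicate["evidence"]
        if entry["digest"]["sha256"] not in available
    ]
```

An empty result means every referenced DSSE is present; a verifier that only cares about some predicate types could instead check just the entries it needs.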

TomHennen avatar Mar 22 '23 12:03 TomHennen

Oh, I should note that you could also do that with the SCAI predicate, you just wouldn't include the content directly.

TomHennen avatar Mar 22 '23 14:03 TomHennen

Yes, completely agree that including the content of attestations might be too large in practice. In your example, I guess you're assuming that the evidenceCollection attestation would be included in the same bundle as the DSSEs in the collection? Otherwise, I suppose the evidence field for each DSSE could include the uri or downloadLocation.

marcelamelara avatar Mar 22 '23 16:03 marcelamelara

> I guess you're assuming that the evidenceCollection attestation would be included in the same bundle as the DSSEs in the collection?

Right

> Otherwise, I suppose the evidence field for each DSSE could include the uri or downloadLocation.

Yes that would work well too. They probably don't need to be mutually exclusive.

TomHennen avatar Mar 22 '23 16:03 TomHennen

Ok! Given that there are also a few other use cases for creating evidenceCollections, maybe the next step should be to update the SCAI predicate to use ResourceDescriptors, and add an example or two showcasing how one might use the predicate to create evidenceCollections.

marcelamelara avatar Mar 22 '23 17:03 marcelamelara

@colek42 @mikhailswift @MarkLodato I think that the existing SCAI predicate may be able to address the use cases listed in this issue. PTAL at #170. Would appreciate any input!

marcelamelara avatar Mar 27 '23 18:03 marcelamelara