Provenance: switch to SPDX?

Open MarkLodato opened this issue 2 years ago • 13 comments

SPDX 3.0 is developing a "build profile" that overlaps almost entirely with the SLSA Provenance schema. In other words, SPDX 3.0 will (hopefully) be able to express everything that SLSA Provenance can, but not vice versa.

I'd like us to consider deprecating the SLSA Provenance format in favor of SPDX 3.0, assuming it meets all our needs. In practice, this would mean the two communities working jointly on a single schema. We would still recommend (but not require) DSSE and in-toto attestations; this is just about the predicate layer. (This topic came up during the 2022-07-21 community meeting.)

Pros:

  • Allows a single attestation to satisfy both SLSA and SBOM. (SPDX is one of the three recognized SBOM formats, while SLSA Provenance is not.)
  • Reduces user confusion and implementation complexity of having two similar-but-not-quite-the-same concepts / formats.
  • SLSA users can possibly take advantage of existing SPDX tooling rather than having to develop something SLSA-specific.
  • Solves the naming confusion between SLSA Level/Requirements and SLSA Provenance.

Cons:

  • Increases the overhead of collaboration and consensus building, especially due to the increased set of stakeholders and use cases.
  • Ties us to the SPDX release process, which is often very long. (Currently SLSA has no fixed release schedule.)
  • SPDX is more complex than the current format and may be less desirable for some reason.

I have started attending the SPDX Build Profile meetings to get a better idea of how close we are to alignment. From what I've seen, it seems like it might be a good fit for SLSA.

Next step: look at a few concrete examples of SPDX 3.0 compared to their SLSA Provenance equivalents to get a feel for whether it's a good fit.

MarkLodato avatar Aug 05 '22 14:08 MarkLodato

Actually, before we even look at the specific format, what is the general opinion on this idea? Do you think it is desirable to merge with SPDX? Is it desirable to stay separate? Any thoughts on the proposed Pros/Cons?

MarkLodato avatar Aug 05 '22 15:08 MarkLodato

https://github.com/spdx/spdx-3-model/ - Linking this here so we have something to reference.

Do you know the status of the build profile stuff?

There are also some other concerns.

  1. SLSA provenance tends to be a few dozen kilobytes. SPDX docs can be megabytes. If you can pull out the build profile into its own predicate, then maybe?
  2. The lifecycle of changes to SPDX historically has taken a really long time.
  3. There are already similar things done by CycloneDX as well.

mlieberman85 avatar Aug 05 '22 16:08 mlieberman85

To further clarify my thinking: I am not suggesting a different approach to generating provenance, nor adding more data to the provenance. Rather, I am suggesting purely changing the syntax of the same data so that it conforms to SPDX. Or, to phrase it another way: recommend a very specific application of SPDX such that it looks essentially the same as what we have now.

For example, instead of this:

```jsonc
{
  "predicate": {
    "buildType": ...,
    "invocation": { ... },
    "materials": { ... },
    ...
  }
}
```

You would write (made-up syntax):

```jsonc
{
  "predicate": {
    // Mandatory SPDX fields, I think?
    "@type": "SpdxDocument",
    "@id": "urn:spdx.dev:null-document",
    "specVersion": "3.0",
    "created": "...",
    "profile": ["build"],
    "dataLicense": "...",
    "createdBy": "...",
    "elements": [
      {
        "@type": "Build",
        // Same as SLSA Provenance, possibly with slightly different syntax:
        "buildType": ...,
        "invocation": { ... },
        "materials": { ... },
      }
    ]
  }
}
```

I'm still waiting to see an actual real-world example, which I think would help clarify.
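In the meantime, the "same data, different syntax" idea can be sketched in a few lines of Python. This is only an illustration under loud assumptions: the envelope fields mirror the made-up syntax above and are not taken from any published SPDX 3.0 schema, and `wrap_as_spdx_build` and its parameters are hypothetical names.

```python
import json


def wrap_as_spdx_build(slsa_predicate, created, created_by, data_license):
    """Wrap SLSA Provenance predicate fields in the made-up SPDX envelope.

    Assumption: the envelope keys below come from the illustrative syntax
    in this thread, not from a real SPDX 3.0 schema.
    """
    # Copy the provenance fields unchanged; only the "@type" marker is added.
    build_element = {**slsa_predicate, "@type": "Build"}
    return {
        "@type": "SpdxDocument",
        "@id": "urn:spdx.dev:null-document",
        "specVersion": "3.0",
        "created": created,
        "profile": ["build"],
        "dataLicense": data_license,
        "createdBy": created_by,
        "elements": [build_element],
    }


if __name__ == "__main__":
    # Placeholder values for illustration only.
    example_predicate = {
        "buildType": "https://example.com/build-type",
        "invocation": {},
        "materials": [],
    }
    wrapped = wrap_as_spdx_build(
        example_predicate,
        created="2022-08-05T16:00:00Z",
        created_by="example-builder",
        data_license="CC0-1.0",
    )
    print(json.dumps(wrapped, indent=2))
```

The point of the sketch is that the provenance fields pass through untouched; the only new material is the document-level envelope.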

MarkLodato avatar Aug 05 '22 16:08 MarkLodato

> The lifecycle of changes to SPDX historically has taken a really long time.

Yes, we would be tying ourselves to the SPDX change and release process. I'll add that as a con.

MarkLodato avatar Aug 05 '22 16:08 MarkLodato

Another con is that people have been developing tooling around the existing format. Would those implementers accept a major format change? I suppose that's literally what you're asking here. :)

TomHennen avatar Aug 05 '22 17:08 TomHennen

My take would be that SLSA should continue with its own predicate format. Let it capture the cadence and features the community wants and builds.

In parallel, SLSA could provide guidance, updated with each new version, on how to map the SLSA information to the upcoming SPDX build profile and the CycloneDX provenance and pedigree features to create a SLSA attestation with spdx/cdx predicates (if at all possible).

I think having its own format allows the project to break free from possible expression constraints imposed by the other formats and move at its own speed.

puerco avatar Aug 05 '22 20:08 puerco

@puerco, I'm guessing you're suggesting we could even create a tool to generate SPDX build profiles from SLSA provenance?

TomHennen avatar Aug 05 '22 20:08 TomHennen

If guidance to map attributes is properly codified (and equivalents do exist) it should be possible yes.
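A minimal sketch of what codified mapping guidance could look like, in Python. Everything here is hypothetical: `FIELD_MAP` and `translate` are illustrative names, and the target field names are placeholders rather than real SPDX or CycloneDX fields. Fields with no listed equivalent are reported instead of being silently dropped.

```python
# Hypothetical mapping table: SLSA Provenance field -> target-format field.
# The target names are placeholders, not real SPDX or CycloneDX field names.
FIELD_MAP = {
    "buildType": "buildType",
    "invocation": "invocation",
    "materials": "materials",
}


def translate(predicate, field_map=FIELD_MAP):
    """Translate predicate fields via the table; collect fields with no equivalent."""
    translated = {}
    unmapped = []
    for key, value in predicate.items():
        target = field_map.get(key)
        if target is None:
            # No equivalent is codified for this field, so flag it.
            unmapped.append(key)
        else:
            translated[target] = value
    return translated, sorted(unmapped)


if __name__ == "__main__":
    # Placeholder predicate; "metadata" is deliberately absent from FIELD_MAP.
    predicate = {
        "buildType": "https://example.com/build-type",
        "materials": [],
        "metadata": {},
    }
    translated, unmapped = translate(predicate)
    print("translated:", translated)
    print("no equivalent:", unmapped)
```

Keeping the mapping as data rather than code would let the guidance be updated with each new version without touching the translation logic.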

puerco avatar Aug 05 '22 20:08 puerco

I support this idea and encourage continued collaboration and alignment between SLSA and SBOM standards in whatever form it takes, e.g. by directly implementing the SPDX 3.0 Build Profile for SLSA Provenance or seeking compatibility between the SLSA Provenance schema and SPDX/CDX.

jeff-schutt avatar Aug 06 '22 02:08 jeff-schutt

As others have pointed out, CDX already has support for pedigree, etc, and we’re working on support for formulation (for any supported component type or service). Mapping between SLSA Provenance and CDX should be possible. CDX v1.5 will include formulation support. Historically we release in Q1 or early Q2 every year. But formulation will likely be ratified before then. We’d like to work with the SLSA community to ensure interop.

cc @coderpatros @darthhater

stevespringett avatar Aug 06 '22 03:08 stevespringett

+1 to @puerco and keeping them separate for now.

dlorenc avatar Aug 06 '22 16:08 dlorenc

OK. Among the responses thus far, it seems like the preference is to keep SLSA Provenance but work with SPDX and CycloneDX communities to align the models, such that one can translate freely between the three formats.

@stevespringett I'd love to take you up on that. Would it make sense for us all (SLSA, SPDX, CDX) to jointly come up with a model, rather than independently designing and then trying to align after the fact?

Any other opinions?

MarkLodato avatar Aug 08 '22 15:08 MarkLodato

@MarkLodato It depends on what you mean by a model. From what I can tell, CDX formulation is a superset of the SPDX build profile, and I would imagine the data models are very different as a result. However, I think a common abstract model would be beneficial to everyone, and I'd be happy to contribute to it.

stevespringett avatar Aug 09 '22 17:08 stevespringett

I'm closing this as "will not fix" since we decided to keep the Provenance format for v1.0 at least.

That said, let's continue the conversations to align the SLSA Provenance, CDX Formulation, and SPDX Build models. @mrutkows joined today's SLSA Spec meeting to explain CDX Formulation in more detail, and we agreed to meet again to continue the discussion. I think there's a lot of synergy here. It's not yet clear whether it makes sense to completely deprecate Provenance in favor of CDX or SPDX, but if it does become clear, let's open another issue for that.

MarkLodato avatar Mar 06 '23 19:03 MarkLodato