transparency-exchange-api Approach to Collections' Publisher API

Our current understanding of a Collection is a state of Artifacts (metadata documents) belonging to a Release.

A new version of a Collection is automatically created whenever the state of Artifacts changes.

With this understanding, having an explicit Publisher CRUD API for Collections seems problematic for the following reasons:

A Collection is implicitly created when a Release is created. Imagine we have just created a Release with no Artifacts: it would still have a Collection at Version 1 that contains the empty set (by definition).
A Collection cannot be deleted if its Release still exists.
If any CRUD operation (except Read) happens on Release Artifacts, the Collection is updated automatically. In that sense, an explicit Update call on a Collection is not very meaningful.

With all that, the only solution I see is to avoid an explicit Publisher API on Collections, and instead treat them as a Lamport timestamp for Release Artifacts (ref: https://en.wikipedia.org/wiki/Lamport_timestamp).
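To illustrate what I mean, here is a minimal Python sketch, assuming an in-memory model. The `Release` class and its method names are hypothetical and not part of the TEA spec; the only point is that the Collection version behaves like a logical counter that advances when the set of Artifacts changes, with no explicit create/update/delete call on the Collection itself.

```python
from dataclasses import dataclass, field


@dataclass
class Release:
    """Hypothetical in-memory Release; names are illustrative, not TEA API."""

    artifacts: set[str] = field(default_factory=set)
    # Version 1 exists as soon as the Release does and holds the empty set.
    collection_versions: list[frozenset[str]] = field(
        default_factory=lambda: [frozenset()]
    )

    @property
    def collection_version(self) -> int:
        """Current Collection version (1-based logical counter)."""
        return len(self.collection_versions)

    def _snapshot(self) -> None:
        # Record a new Collection version only if the Artifact state changed.
        state = frozenset(self.artifacts)
        if state != self.collection_versions[-1]:
            self.collection_versions.append(state)

    def add_artifact(self, artifact_uuid: str) -> None:
        self.artifacts.add(artifact_uuid)
        self._snapshot()

    def remove_artifact(self, artifact_uuid: str) -> None:
        self.artifacts.discard(artifact_uuid)
        self._snapshot()
```

In this sketch a publisher only ever touches Artifacts on the Release; the Collection history is a read-only by-product.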

I believe we should discuss this; it would be great if anybody has a better proposal.

May 11 '25 21:05 taleodor

TEA Artifacts are immutable and reusable and can be part of multiple TEA Collections. Currently:

there is no concept of TEA Artifact update: artifacts can only be added, and each new "version" of an artifact will have a different UUID.
TEA Collections must be updated explicitly to include the new UUID of the Artifact.

Even if we add a version field to TEA Artifact to follow its evolution in time, it is always possible that the TEA Artifact will be updated in some collections, but not in others.

It is fair to assume that a single VEX file will be shared between multiple TEA Collections. However, once a TEA Collection (or more precisely the associated TEA Component Release) reaches EOL, the TEA Collection will no longer receive updates of the VEX file.

May 13 '25 07:05 ppkarwasz

We actually have a concept of artifact update, but it has not been fleshed out.

In particular, we have a collection update event enum, which includes ARTIFACT_UPDATED and a special case, VEX_UPDATED.

artifacts can only be added and each new "version" of an artifact will have a different UUID

This is actually a point to discuss. Personally, I don't believe it should be done this way. For example, CycloneDX explicitly has a version field within the SBOM, so you want some traceability for a new version of the same SBOM with the same serial number.

I would rather add a version field to the artifact object that matches the one in the document, if present. If it is not present in the document itself, we can rely on the version field within TEA to trace updates.

Even if we add a version field to TEA Artifact to follow its evolution in time, it is always possible that the TEA Artifact will be updated in some collections, but not in others.

We could then change artifact representation in collections from a list of Artifact UUIDs to a list of Tuples (Artifact UUID, Artifact Version). Alternatively, we can keep Artifact UUIDs unique, but then we need to have separate fields for Serial Number and Version within the Artifact object for traceability. I prefer (Artifact UUID, Artifact Version) tuple.
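To make the (Artifact UUID, Artifact Version) option concrete, here is a minimal Python sketch. `ArtifactRef` and `stale_refs` are hypothetical names, not part of the TEA spec, and the UUID is just an example value. It assumes versions are integers that only increase, and shows how a Collection pinned to an older version of a shared VEX could be detected.

```python
from typing import NamedTuple


class ArtifactRef(NamedTuple):
    """Hypothetical (Artifact UUID, Artifact Version) pair; not a TEA type."""

    uuid: str
    version: int


def stale_refs(
    collection: list[ArtifactRef], latest: dict[str, int]
) -> list[ArtifactRef]:
    """Return refs that pin an older version than the latest known one."""
    return [
        ref
        for ref in collection
        if latest.get(ref.uuid, ref.version) > ref.version
    ]


VEX_UUID = "f08a6ccd-4dce-4759-bd84-c626675d60a7"  # example UUID only
latest_versions = {VEX_UUID: 3}

# One Collection follows the VEX updates, the other stopped at version 1
# (for instance because its Release reached EOL).
active_collection = [ArtifactRef(VEX_UUID, 3)]
eol_collection = [ArtifactRef(VEX_UUID, 1)]
```

With the tuple, both Collections reference the same Artifact UUID, and the version alone tells us which one lags behind.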

May 13 '25 11:05 taleodor

artifacts can only be added and each new "version" of an artifact will have a different UUID

This is actually a point to discuss. Personally, I don't believe it should be done this way. For example, CycloneDX explicitly has a version field within the SBOM, so you want some traceability for a new version of the same SBOM with the same serial number.

I would rather add a version field to the artifact object that matches the one in the document, if present. If it is not present in the document itself, we can rely on the version field within TEA to trace updates.

I totally agree, I created #155 to fix that.

Even if we add a version field to TEA Artifact to follow its evolution in time, it is always possible that the TEA Artifact will be updated in some collections, but not in others.

We could then change artifact representation in collections from a list of Artifact UUIDs to a list of Tuples (Artifact UUID, Artifact Version). Alternatively, we can keep Artifact UUIDs unique, but then we need to have separate fields for Serial Number and Version within the Artifact object for traceability. I prefer (Artifact UUID, Artifact Version) tuple.

I think we could implement both: a uuid/version pair that applies to every type of document, and an artifactId that holds the BOM-Link for CycloneDX documents and the document namespace for SPDX documents. For example:

{
  "uuid": "f08a6ccd-4dce-4759-bd84-c626675d60a7",
  "version": 123,
  "artifactId": "urn:cdx:f08a6ccd-4dce-4759-bd84-c626675d60a7/123"
  ...
}

for CycloneDX and:

{
  "uuid": "f08a6ccd-4dce-4759-bd84-c626675d60a7",
  "version": 123,
  "artifactId": "https://apache.org/spdxdocs/log4j-core-f08a6ccd-4dce-4759-bd84-c626675d60a7/123"
  ...
}

for SPDX.
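As a rough sketch of how the artifactId values above could be derived, assuming the uuid and version are already known: the function names, and the `base` and `name` parameters for SPDX, are hypothetical and only mirror the shape of the two examples; SPDX document namespaces are chosen by the document author, so the second helper shows one possible layout, not a rule.

```python
def cyclonedx_bom_link(serial_uuid: str, version: int) -> str:
    """Build a CycloneDX BOM-Link of the form urn:cdx:<serialNumber>/<version>."""
    return f"urn:cdx:{serial_uuid}/{version}"


def spdx_namespace(base: str, name: str, doc_uuid: str, version: int) -> str:
    """Hypothetical SPDX namespace layout matching the example above."""
    return f"{base.rstrip('/')}/{name}-{doc_uuid}/{version}"


UUID = "f08a6ccd-4dce-4759-bd84-c626675d60a7"  # example UUID from this thread

cdx_id = cyclonedx_bom_link(UUID, 123)
spdx_id = spdx_namespace("https://apache.org/spdxdocs", "log4j-core", UUID, 123)
```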

May 13 '25 12:05 ppkarwasz