Proposal to deprecate unused Rekor kinds in public instance: alpine, rpm, rfc3161, jar, tuf
Note: This is for the public instance only. Private deployments will be unaffected as the code will not be removed at this time.
Background
Rekor supports a variety of entry types, or "kinds", so that clients do not need to know how to parse and canonicalize a signed artifact. The most commonly used kind is "hashedrekord", which is simply a hash of the artifact along with its signature and public key or certificate. Types like "jar" are supported so that a client can simply send a JAR and Rekor will extract the artifact hash, signature and verifier from the object.
Data
As of 4/11/2024, out of ~85 million entries in the public instance, we have:
- 4 alpine entries
- 33 rpm entries
- 112 jar entries
- 221 rfc3161 entries
- 1406 tuf entries
Checking the last integrated timestamp for each of the above kinds:
- alpine = 2022-09-23
- rpm = 2023-08-11, before that 2022-09-23
- jar = 2024-01-29, before that 2023-08-21
- rfc3161 = 2022-09-22
- tuf = 2024-04-11, it appears to be one entry every day
Alpine, RPM, and RFC3161 are effectively unused.
The last JAR upload was a test JAR, and the one before that appears to have been by @sabre1041. @sabre1041, are you aware of any need to support JARs in the public infrastructure?
Based on searching GitHub for matching key IDs, the TUF file appears to be from Datadog, matching this file. @trishankatdatadog, are you aware of any reliance on the TUF metadata?
Proposed Deprecation Timeline for Public Instance
In a month or so, we will temporarily disable new entry uploads for these types and wait a week to see if there's any impact reported by the community.
Following that, we will permanently disable uploads for new entries.
Note it will still be possible to retrieve and verify old entries, just not upload new ones.
What to do instead
Move parsing logic client-side and upload hashedrekord records.
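As a rough illustration of what "client-side" could look like, here is a minimal sketch in Python. It assumes the client has already extracted the artifact bytes, the raw signature, and the PEM-encoded public key from whatever format it is handling (JAR, RPM, etc.). The function name `build_hashedrekord_entry` is hypothetical, and the field layout reflects my understanding of the hashedrekord 0.0.1 schema, so please check it against the schema in the Rekor repository before relying on it.

```python
import base64
import hashlib
import json


def build_hashedrekord_entry(artifact: bytes, signature: bytes, public_key_pem: bytes) -> dict:
    """Build a proposed hashedrekord entry from data parsed on the client.

    Assumption: the layout below follows the hashedrekord 0.0.1 schema
    (hex-encoded SHA-256 digest, base64-encoded signature and public key).
    """
    digest = hashlib.sha256(artifact).hexdigest()
    return {
        "apiVersion": "0.0.1",
        "kind": "hashedrekord",
        "spec": {
            # Only the digest is sent; the artifact itself never leaves the client.
            "data": {"hash": {"algorithm": "sha256", "value": digest}},
            "signature": {
                "content": base64.b64encode(signature).decode("ascii"),
                "publicKey": {
                    "content": base64.b64encode(public_key_pem).decode("ascii"),
                },
            },
        },
    }


if __name__ == "__main__":
    # Placeholder inputs for illustration only; a real client would use the
    # signature and key extracted from the signed artifact.
    entry = build_hashedrekord_entry(
        artifact=b"example artifact contents",
        signature=b"placeholder-signature-bytes",
        public_key_pem=b"-----BEGIN PUBLIC KEY-----\nplaceholder\n-----END PUBLIC KEY-----\n",
    )
    print(json.dumps(entry, indent=2))
```

The resulting JSON would then be submitted as the body of a new log entry. With this approach the log only ever sees the digest, signature, and verifier, not the original artifact.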
Continued code support
We won't remove these kinds from the codebase, since doing so would break verification of older entries. Also note that it will still be possible to deploy a private instance with these kinds enabled.
We will revisit the need for all of these types when we simplify the API as part of a V2 design.
> Based on searching GitHub for matching key IDs, the TUF file appears to be from Datadog, matching this file. @trishankatdatadog, are you aware of any reliance on the TUF metadata?
These are used only to publicly record every new timestamp metadata for one of our TUF repositories. We don't need to rely on any specific TUF kind. I'll take a look next week at using the hashedrekord kind instead. Thanks for pointing this out!
Strong +1 from me -- handling the differences between each type is a source of lots of subtle bugs on the client side, and migrating everything (or as much as possible) over to hashedrekord means that the Rekor instance never learns or has to operate on the full pre-image.
For completeness, the other kinds, from most to least used:
- hashedrekord
- intoto - Usage should be trending downwards as the dsse kind should be used. I know many clients are still using this, though, for the attestation storage feature.
- rekord - I don't know why this is being used, as hashedrekord subsumed it. Will look more into this, but this won't be deprecated regardless.
- dsse
- helm - Will need to look into this more, but there is still daily usage. Edit: Seems to be just one user based on the PGP key ID. If anyone is using this kind, lemme know. Edit-Edit: This is used by helm-sigstore.
We may want to add support for other verifiers in hashedrekord, e.g., not just public keys or certs but also PGP, so that RPMs or Helm charts that are signed with PGP can be uploaded. There's already support in rekord; we just need to copy this over.