`ResolutionRequestStatus` passes back source ref data in a structured way
Feature request
ResolutionRequestStatusFields for remote resolution currently has only a Data field, which stores the string representation of the resolved content. It would be great to add a structured SourceRef field to ResolutionRequestStatus, typed as the configSource that SLSA defines.
current
type ResolutionRequestStatusFields struct {
	// Data is a string representation of the resolved content
	// of the requested resource in-lined into the ResolutionRequest
	// object.
	Data string `json:"data"`
}
desired
type ResolutionRequestStatusFields struct {
	// Data is a string representation of the resolved content
	// of the requested resource in-lined into the ResolutionRequest
	// object.
	Data string `json:"data"`
	// SourceRef is structured information about the source that the
	// resolved content was fetched from.
	SourceRef intoto.ConfigSource
}
Use case
As discussed in https://github.com/tektoncd/pipeline/issues/5522, we need to pass the source ref information to the Run status so that Chains can pick it up and record the link back to the origin in the SLSA provenance.
Currently, we are trying to do this through annotations, e.g. for the git resolver in https://github.com/tektoncd/pipeline/pull/5397. The problem is that annotations are not structured. This can cause confusion about what those annotations mean, and it makes it hard to tell how they interact with existing annotations on the Run object.
Had this discussion with @wlynch in today's S3C meeting.
@abayer Please comment here if you have any questions or concerns. Happy to take this on if we agree on the approach. Thanks!
So there's the additional problem that we can't guarantee structured resolution source information - that's going to depend on the resolver implementation itself. We can control that for git, bundles, cluster, and hub, but not for any third-party resolvers that are written. I'm also a bit wary of using a struct because even just with those four resolvers, we've got 5+ from git (I forget off the top of my head if more are added in #5397), 3 in bundles, and 2 in cluster (hub currently doesn't have any annotations), so that would be a pretty dang big struct. It really feels like we're just going to end up approximating a map anyway.
It just needs to be structured enough that we can report back in a general way to record what was fetched in build provenance. https://github.com/in-toto/attestation#provenance-example has an example of what this should look like.
A pURL + digest identifier would do the trick. As @chuangw6 mentioned, the in-toto ConfigSource is what we're ultimately looking to populate. The schema is pretty flexible, so resolvers can decide for themselves what the format of the pURL is and which revision types they want to support.
I'm also a bit wary of using a struct because even just with those four resolvers, we've got 5+ from git (I forget off the top of my head if more are added in #5397), 3 in bundles, and 2 in cluster (hub currently doesn't have any annotations), so that would be a pretty dang big struct. It really feels like we're just going to end up approximating a map anyway.
If we start using the structured in-toto ConfigSource, I think we wouldn't need to create those annotations at all. Passing the structured SourceRef from the ResolutionRequest to the pipeline reconciler should be sufficient, and it is essentially what Chains needs for the provenance.
type ResolutionRequestStatusFields struct {
	// Data is a string representation of the resolved content
	// of the requested resource in-lined into the ResolutionRequest
	// object.
	Data string `json:"data"`
	// SourceRef is structured information about the source that the
	// resolved content was fetched from.
	SourceRef intoto.ConfigSource
}
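As a rough illustration of the hand-off described above, the sketch below copies a resolver-reported SourceRef onto a run status. Everything beyond `Data` and `SourceRef` is an assumption: `ConfigSource` is a local stand-in for the in-toto type, `runStatus` and `recordSource` are hypothetical names, the `sourceRef` JSON tag is invented, and `SourceRef` is made a pointer here so that resolvers which cannot report an origin (the third-party case raised earlier) can simply leave it unset.

```go
package main

import "fmt"

// ConfigSource is a local stand-in for the in-toto type (assumption).
type ConfigSource struct {
	URI        string            `json:"uri,omitempty"`
	Digest     map[string]string `json:"digest,omitempty"`
	EntryPoint string            `json:"entryPoint,omitempty"`
}

// ResolutionRequestStatusFields follows the proposed struct, except that
// SourceRef is an optional pointer and has an invented JSON tag.
type ResolutionRequestStatusFields struct {
	Data      string        `json:"data"`
	SourceRef *ConfigSource `json:"sourceRef,omitempty"`
}

// runStatus is a hypothetical stand-in for the part of the Run status
// that Chains would read.
type runStatus struct {
	ConfigSource *ConfigSource
}

// recordSource copies the resolver-reported source onto the run status.
// It returns false and leaves the status untouched when the resolver
// did not report a source.
func recordSource(res ResolutionRequestStatusFields, run *runStatus) bool {
	if res.SourceRef == nil {
		return false
	}
	cp := *res.SourceRef
	// Copy the digest map so later changes to the request do not leak
	// into the run status.
	cp.Digest = make(map[string]string, len(res.SourceRef.Digest))
	for alg, val := range res.SourceRef.Digest {
		cp.Digest[alg] = val
	}
	run.ConfigSource = &cp
	return true
}

func main() {
	res := ResolutionRequestStatusFields{
		Data: "<resolved content>",
		SourceRef: &ConfigSource{
			URI:    "git+https://example.com/org/catalog.git", // placeholder
			Digest: map[string]string{"sha1": "0123456789abcdef0123456789abcdef01234567"},
		},
	}
	run := &runStatus{}
	if recordSource(res, run) {
		fmt.Println("recorded source:", run.ConfigSource.URI)
	}
}
```

With this shape no resolver-specific annotations are involved: the reconciler only forwards one optional field, whatever the resolver put into it.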