Feature: Name based Remote Resolution

Open wlynch opened this issue 1 year ago • 3 comments

Feature request

Previously you used to be able to reference OCI bundles via a simple bundle field:

taskRef:
  bundle: docker.io/myrepo/mycatalog@sha256:abc123

but this has been marked as deprecated. 😢 The new way is to use remote resolution, which is much more verbose:

taskRef:
    resolver: bundles
    params:
    - name: bundle
      value: docker.io/myrepo/mycatalog@sha256:abc123

I liked the simplicity of the field-based approach. I want to see if we can bring a version of this back (at least for simple cases), and fall back to params for everything else. 🙏

Use case

name is already a part of the remote resolver interface yet it is unused in the actual implementation.

I think we can use this to provide a short hand for remote resolution params for well-known types, e.g.:

taskRef:
  name: bundle://docker.io/myrepo/mycatalog@sha256:abc123

Using git as another example:

taskRef:
  name: git://github.com/wlynch/test@main#task.yaml?token=secret-name

which would resolve to:

taskRef:
    resolver: git
    params:
    - name: url
      value: github.com/wlynch/test
    - name: revision
      value: main
    - name: pathInRepo
      value: task.yaml
    - name: token
      value: secret-name
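
As a rough illustration of the expansion above, here is a minimal Python sketch. The function name `expand_git_name` and the exact splitting rules are assumptions made for this example; the issue only shows one input/output pair and does not define a grammar. Note that the example name puts the `?query` after the `#fragment`, which differs from standard URL ordering, so the sketch splits by hand instead of relying on a URL parser.

```python
from urllib.parse import parse_qsl


def expand_git_name(name: str) -> dict:
    """Expand 'git://<url>@<revision>#<pathInRepo>?<key=value&...>' into a
    resolver/params structure. Hypothetical helper, not part of Tekton."""
    scheme, sep, rest = name.partition("://")
    if not sep or scheme != "git":
        raise ValueError(f"not a git:// shorthand: {name!r}")

    # In the issue's example the query follows the fragment, so peel it off first.
    rest, _, query = rest.partition("?")
    rest, _, path_in_repo = rest.partition("#")

    # Everything after the last '@' is treated as the revision (an assumption).
    if "@" in rest:
        url, _, revision = rest.rpartition("@")
    else:
        url, revision = rest, ""

    params = [{"name": "url", "value": url}]
    if revision:
        params.append({"name": "revision", "value": revision})
    if path_in_repo:
        params.append({"name": "pathInRepo", "value": path_in_repo})
    # Remaining query keys are passed through as params unchanged.
    for key, value in parse_qsl(query):
        params.append({"name": key, "value": value})

    return {"resolver": "git", "params": params}
```

Calling it on the name from the example yields the same four params, in the same order, as the verbose form above.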

And you could still mix-and-match params + name for anything that doesn't conform / isn't easily represented in a name-based format.

taskRef:
  name: git://github.com/wlynch/test@main#task.yaml
  params:
    - name: token
      value: secret-name

This should be backwards compatible: :// can't appear in a valid k8s object name, so there's a clear opt-in for this behavior. This also won't affect existing remote resolution specs, since name is currently unused, which also means arbitrary resolvers can continue to be used.
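
The opt-in argument can be checked with a short Python sketch. It assumes the name is validated as an RFC 1123 DNS subdomain, the rule Kubernetes applies to most object names. The helper names and the rule that explicit params override name-derived ones are assumptions for illustration; the issue does not specify a precedence.

```python
import re

# RFC 1123 DNS subdomain: lowercase alphanumerics, '-' and '.', at most 253 chars.
_DNS_SUBDOMAIN = re.compile(
    r"[a-z0-9]([-a-z0-9]*[a-z0-9])?(\.[a-z0-9]([-a-z0-9]*[a-z0-9])?)*"
)


def is_k8s_object_name(value: str) -> bool:
    """True if value is a valid DNS-subdomain-style object name."""
    return len(value) <= 253 and _DNS_SUBDOMAIN.fullmatch(value) is not None


def is_shorthand(value: str) -> bool:
    """Hypothetical opt-in check: a '://' marks a name-based resolution request."""
    return "://" in value


def merge_params(from_name: list, explicit: list) -> list:
    """Combine name-derived params with explicit ones.
    Assumption: an explicit param wins when both define the same name."""
    merged = {p["name"]: p["value"] for p in from_name}
    merged.update({p["name"]: p["value"] for p in explicit})
    return [{"name": k, "value": v} for k, v in merged.items()]
```

Because ':' and '/' are outside the allowed character set, no string containing :// can pass `is_k8s_object_name`, so the two checks never both succeed for the same value.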

cc @chitrangpatel

wlynch avatar Dec 20 '23 04:12 wlynch

Previously you used to be able to reference OCI bundles via a simple bundle field:

taskRef:
  bundle: docker.io/myrepo/mycatalog@sha256:abc123

but this has been marked as deprecated. 😢 The new way is to use remote resolution, which is much more verbose:

taskRef:
    resolver: bundles
    params:
    - name: bundle
      value: docker.io/myrepo/mycatalog@sha256:abc123

This is missing some bits.

# Before
taskRef:
  name: task-or-pipeline-to-get-from-bundle # to know which "resource" to get from the bundle
  bundle: docker.io/myrepo/mycatalog@sha256:abc123
# After
taskRef:
  resolver: bundles
  params:
  - name: bundle
    value: docker.io/myrepo/mycatalog@sha256:abc123
  - name: kind # probably optional, "can be inferred"
    value: task
  - name: name 
    value: task-or-pipeline-to-get-from-bundle

I liked the simplicity of the field-based approach. I want to see if we can bring a version of this back (at least for simple cases), and fall back to params for everything else. 🙏

I think you liked it because it was more concise, not really simpler. The current syntax is verbose, but definitely not complex. In that regard, git://github.com/wlynch/test@main#task.yaml?token=secret-name or bundle://docker.io/myrepo/mycatalog@sha256:abc123#task:name-of-the-task is probably slightly more complex (for the user to read or write, and probably to parse).

If we dig a bit more:

  • It would need to be bundles:// instead of bundle:// (because of the name of the resolver).
  • What makes a "field" part of the main scheme ({name}://{main-part}?{options}), and which ones are not? E.g. for the git resolver, the url is probably obvious, but you can also use repo + org (without a url).
  • Some resolvers might be confusing to users. For example, git://… is a "scheme" supported by the git command line; for a user who knows that, is it that "scheme" or something else? The same goes for the http resolver: would we use https://{user}:{password}@example.com? What about the http-password-secret-key? Or should we use http://example.com/…?http-username=foo&http-password-secret-key=my-secret? Here, http is the name of the resolver, not the actual HTTP protocol. What happens if I want to use HTTP with or without TLS: do I use http://https://…?
  • We can "decide"/"infer" what it resolves to for built-in resolvers, but how would it work for non-built-in ones? E.g. given foo://bar#baz?toto=tata, what should it resolve to? (That is, there is no single param name required by the spec, so different resolvers can have completely different params.)
  • Following from the previous item, would we restrict this to the built-in resolvers only? It might also confuse people if the syntax differs depending on which resolver they use.

What if we had a slightly different syntax? (I think that was even in the initial TEP; we just moved away from it to stay consistent with how params work in other parts of the API. It is also kind of what we want to do for v2 already.)

# Before
taskRef:
  name: task-or-pipeline-to-get-from-bundle # to know which "resource" to get from the bundle
  bundle: docker.io/myrepo/mycatalog@sha256:abc123
# After
taskRef:
  resolver: bundles
  params:
  - bundle: docker.io/myrepo/mycatalog@sha256:abc123
  - kind: task # probably optional, "can be inferred"
  - name: task-or-pipeline-to-get-from-bundle

I tend to like the idea of conciseness, but I would want to make sure we are not making it more confusing, and that we can treat all resolvers the same.

vdemeester avatar Dec 20 '23 07:12 vdemeester

I like @wlynch's ideas. I think we could have both, and the concise form does not need to cover every key/value of the verbose one.

I understand @vdemeester's argument that this may bring some confusion, but isn't that only the case if we want to cover every resolver and every parameter in the concise form? Maybe we could restrict it to the most common and simple use cases, and leave the ones that cause confusion to the more verbose form?

(As an anecdote, we are keeping the annotations for PAC remote task resolution for the simple use cases, since they are easy to express, and decided that for advanced use cases users should use Tekton resolvers.)

chmouel avatar Dec 21 '23 12:12 chmouel

I agree with @wlynch here that we should use the name field to make simple cases less verbose (I also agree with @vdemeester that less verbose does not mean simpler). For advanced use cases, users still have the params to fall back on.

One catch, though, is that the resolvers may need to interpret the name themselves, since the name format could be unique to each resolver. Maybe the pipeline controller only needs to understand which resolver to send the request to.
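
A minimal Python sketch of that split of responsibilities, assuming the controller reads only the prefix before the first :// and forwards the remainder untouched. The function `route` and its return shape are hypothetical and not taken from the Tekton codebase.

```python
from typing import Optional, Tuple


def route(name: str) -> Tuple[Optional[str], str]:
    """Return (resolver, opaque_reference) for a taskRef name.

    Only the text before the first '://' is interpreted here. The remainder
    is passed along unparsed, so each resolver can apply its own format.
    A name without '://' is treated as a plain object name (resolver is None).
    """
    resolver, sep, opaque = name.partition("://")
    if not sep:
        return None, name
    return resolver, opaque
```

Splitting at the first :// only also settles the http://https://… question raised earlier at the routing level: `route("http://https://example.com/task.yaml")` selects the `http` resolver and hands it `https://example.com/task.yaml` as an opaque string. Whether that form is readable for users remains a separate concern.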

(Also, this is tangential, so let's not discuss it in this thread, but I'd like to know what you think about upfront remote resolution. Feel free to comment in the doc.)

chitrangpatel avatar Dec 21 '23 13:12 chitrangpatel

Hello 👋

I wanted to come back to this. Have we considered using the PURL spec for this? I think it could satisfy multiple use cases. When others write their custom resolvers, they could also parse this format. It provides a generic schema that could work for a variety of resolvers, instead of each one specifying its own.

Today we have the git, hub, bundles and http resolvers, which I think would fit nicely with PURL.

PURL also has parsers available in multiple languages: https://github.com/package-url/purl-spec/tree/master so it should be possible to parse the string without much trouble, as long as it satisfies the purl-spec.
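
To make the PURL idea concrete, here is a simplified Python sketch that splits a string of the form pkg:type/namespace/name@version?qualifiers#subpath into its components. It follows the general splitting order described in the purl-spec but skips the spec's normalization and validation rules, so it is not a compliant parser; one of the existing PURL libraries would be the better choice in practice. The `purl_to_git_params` mapping at the end is purely hypothetical: neither Tekton nor the PURL spec defines how a PURL maps onto resolver params.

```python
from urllib.parse import parse_qsl, unquote


def _split_right(text: str, sep: str):
    """Split on the last occurrence of sep; return (text, '') if sep is absent."""
    head, found, tail = text.rpartition(sep)
    return (head, tail) if found else (text, "")


def parse_purl(purl: str) -> dict:
    """Simplified PURL parser (no normalization, no type-specific rules)."""
    rest, subpath = _split_right(purl, "#")
    rest, qualifiers = _split_right(rest, "?")
    scheme, sep, rest = rest.partition(":")
    if not sep or scheme.lower() != "pkg":
        raise ValueError(f"not a PURL: {purl!r}")
    rest = rest.strip("/")
    ptype, sep, rest = rest.partition("/")
    if not sep or not rest:
        raise ValueError(f"PURL has no name: {purl!r}")
    rest, version = _split_right(rest, "@")
    namespace, _, name = rest.rpartition("/")
    return {
        "type": ptype.lower(),
        "namespace": unquote(namespace),
        "name": unquote(name),
        "version": unquote(version),
        "qualifiers": dict(parse_qsl(qualifiers)),
        "subpath": unquote(subpath),
    }


def purl_to_git_params(parts: dict) -> list:
    """Hypothetical mapping of a 'github' PURL onto git resolver params."""
    if parts["type"] != "github":
        raise ValueError("this sketch only maps the 'github' type")
    return [
        {"name": "url", "value": f"https://github.com/{parts['namespace']}/{parts['name']}"},
        {"name": "revision", "value": parts["version"]},
        {"name": "pathInRepo", "value": parts["subpath"]},
    ]
```

Under these assumptions, the git example from the top of the thread could be written as pkg:github/wlynch/test@main#task.yaml.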

chitrangpatel avatar Mar 06 '24 19:03 chitrangpatel

Thoughts @wlynch @chmouel @vdemeester ?

I would like to drive this forward.

chitrangpatel avatar Mar 06 '24 19:03 chitrangpatel

Superseded by TEP https://github.com/tektoncd/community/pull/1138

chitrangpatel avatar Mar 15 '24 17:03 chitrangpatel