[Experimental] Object reference resolver to support non-Addressable objects
Description
This proposal is about extending the KReference resolver to support non-addressable objects (no `status.address` field). For those objects, we propose to ~~delegate the reference resolution to an external service which, upon receiving a KReference object, replies with a resolved URI or a "not supported" error~~ allow people to define a mapping between "Kind" and "URL-Template", where the URL-Template can use fields from the object being referenced.
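For illustration, a mapping ConfigMap could look something like the sketch below. The key format and the available template fields are assumptions made for this example, not a final schema (the actual `kreference-mapping.yaml` work item appears in the feature stages plan further down):

```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: kreference-mapping
  namespace: knative-eventing
data:
  # Hypothetical entry: resolve JobDefinition references to a job-runner service.
  JobDefinition.v1alpha1.my.batch.job: "https://jobrunner.system-ns.svc.cluster.local/{{ .Name }}"
```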
Use Case
The driving use case: as a user, I want to be able to trigger a batch job every day by having a PingSource CR target a batch job definition:
IMPORTANT NOTE: this is just an example! The source can be anything (GitHub, Kafka, you name it) and JobDefinition can also be anything that is not Addressable.
```yaml
apiVersion: my.batch.job/v1alpha1
kind: JobDefinition
metadata:
  name: pi
spec:
  template:
    spec:
      containers:
        - name: pi
          image: perl
          command: ["perl", "-Mbignum=bpi", "-wle", "print bpi(2000)"]
      restartPolicy: Never
  backoffLimit: 4
---
apiVersion: sources.knative.dev/v1
kind: PingSource
metadata:
  name: trigger-job
spec:
  schedule: "@daily"
  sink:
    ref:
      apiVersion: my.batch.job/v1alpha1
      kind: JobDefinition
      name: pi
```
Exit Criteria
Users are able to reference non-addressable objects.
Experimental flag name: kreference-custom-resolvers
Experimental feature stages plan
Below is the proposed plan for the feature stages (this list implicitly includes the requirements defined in the process):
- Alpha:
  - [x] Extend `URIFromObjectReference` to support custom resolvers in knative/pkg
  - [x] Add kreference-mapping experimental flag
  - [x] Add a new configmap (kreference-mapping.yaml) to configure mappings between "Kind" and "URL-Template"
  - [x] Implement kind -> URL mapping URI resolver (a minimal sketch appears after this plan)
  - [x] PingSource uses the new custom resolver
  - [ ] Track reference
  - [ ] All controllers relying on the sink resolver use the new custom resolver
  - [ ] User documentation
- Beta graduation as soon as one release after inception
- Beta:
- [ ] User documentation stabilization and improvements
- [ ] More e2e tests
- [ ] Add conformance tests
- Stable graduation as soon as two releases after the Beta graduation
- Stable:
- [ ] Add the requirement to the spec
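To make the mapping resolver concrete, here is a minimal, self-contained Go sketch of the core idea referenced in the Alpha items above: look up a URL template keyed by the referenced object's kind/version/group and render it with fields from the reference. The key format, field names, and URLs are assumptions for illustration; this is not the knative/pkg implementation.

```go
// Sketch of kind -> URL-template resolution. The mapping key format
// ("Kind.version.group") and the template fields are assumptions.
package main

import (
	"bytes"
	"fmt"
	"text/template"
)

// ref carries the KReference fields a URL template might use.
type ref struct {
	Name      string
	Namespace string
	UID       string
}

// resolve looks up the URL template registered for the given key and
// renders it with the referenced object's fields.
func resolve(mappings map[string]string, key string, r ref) (string, error) {
	tmplStr, ok := mappings[key]
	if !ok {
		return "", fmt.Errorf("no mapping registered for %q", key)
	}
	tmpl, err := template.New(key).Parse(tmplStr)
	if err != nil {
		return "", fmt.Errorf("invalid template for %q: %w", key, err)
	}
	var buf bytes.Buffer
	if err := tmpl.Execute(&buf, r); err != nil {
		return "", err
	}
	return buf.String(), nil
}

func main() {
	// Hypothetical entry, e.g. loaded from the kreference-mapping ConfigMap.
	mappings := map[string]string{
		"JobDefinition.v1alpha1.my.batch.job": "https://jobrunner.system-ns.svc.cluster.local/{{ .Namespace }}/{{ .Name }}",
	}
	uri, err := resolve(mappings, "JobDefinition.v1alpha1.my.batch.job",
		ref{Name: "pi", Namespace: "default"})
	if err != nil {
		panic(err)
	}
	fmt.Println(uri) // https://jobrunner.system-ns.svc.cluster.local/default/pi
}
```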
Affected WG
- Eventing WG
- Networking WG
/cc @duglin @cr22rc @n3wscott
Questions, since I missed the presentation:
- How does this work with ContainerSource / SinkBinding, which today takes a `K_SINK` URL?
- Is there a mechanism to re-trigger the address resolution if the backend mapping of the resource -> URL changes?
- The given example is a custom resource, which seems like it could implement `status.address` fairly easily. Are there compelling use cases where storing this information in a resource (either `JobDefinition` or an `EventTriggeredJob` which has a ref to a `JobDefinition` and acts as an adapter) doesn't work?
- Is there a Feature Track doc that explains all this, including how it's rolled out across in-tree and out-of-tree components, which might be in multiple languages (and may not be released on the Knative release cadence or using Knative libraries)?
All, thanks for the chat on today's WG call. A couple of additional thoughts:
- Obviously this isn't just about the PingSource, it's just a simple example. Any Source could need to be connected to some non-duck-typed-Addressable resource as its Sink.
- This also doesn't just apply to Sources; Triggers would fall into the same problem space. Making this change in `pkg` would help make support for this possible with minimal changes. But, of course, any Source that's not leveraging `pkg` would need its own custom support for this - but that's no different than them needing to special-case K8s Services today.
- While this issue talks about a "reference resolver service", it's also possible that we could look at other mechanisms to support this extensibility. For example, perhaps we have some kind of ConfigMap that allows people to define a mapping between "Kind" and "URL-Template", where the URL-Template can use fields from the Object being referenced. For example, `batchJob.example.com/v1 -> https://jobrunner.system-ns.svc.cluster.local?uid={{.metadata.uid}}`. Or support both, a mapping and a service. We don't need to decide this now; I think that's part of the brainstorming/experimentation phase, once we get past the "do we want to do this?" phase.
- Other things we'll need to discuss:
  - security of the service to ensure only allowed components can call it
  - how to deal with the resolver service (or mapping ConfigMap) changing the URL of the related resources
  - how to deal with the resolver service itself moving to a new location
- Other possible use cases include:
  - event -> buildDefinition (e.g. Shipwright build definition)
  - event -> workflowDefinition
Thanks @evankanderson for looking at this. Answers:
- The same way as today: the `K_SINK` URL is resolved by the SinkBinding controller which, under the cover, relies on `URIFromObjectReference`.
- No.
- It's not easy if the custom resource is out of our control and owned by a different community. The next logical step is indeed to have an adapter (`EventTriggeredJob`, or a parameterizable service), which is perfectly fine but from a UX perspective might not be ideal. Instead of "a reference to a reference to an object", what about letting users reference the object directly?
- I'm not envisioning Knative providing a reference implementation of an external KReference resolver. This proposal is just about providing an extension point.
- Disclaimer: I'm 100% fine adding this as an experimental feature, which is our vehicle for innovation with safety. (P.S. According to our experimental feature process, you can already go ahead and provide an implementation as part of the buy-in phase and before kicking off the lazy consensus phase.)
- From what I understood, this is about providing a user experience that basically allows users to specify any arbitrary resource as a `Sink` or a `Subscriber`, given that such a resource exposes a webhook at an arbitrary URI and does not necessarily expose such URI in its status.
- So basically, this is asking to change the sources spec so that it's not limited to `Addressable` sinks.
- One goal of eventing and Knative is to preach some of its concepts all over the K8s community, which we believe:
  - Provide a seamless UX with less cognitive load.
  - Are more intuitive and easy to reason about.
  - Facilitate better interop and integration between heterogeneous k8s applications.
- One of those concepts is `Addressable` resources, a core building block and concept of Event Driven Architectures using Knative.
- I worry that if we allow non-addressables, people would default to those and we'd lose any edge that comes with Knative adoption to preach addressable resources, but one might argue the addressable restriction is an adoption barrier in its own right. So it's a chicken-or-egg problem. I don't have an answer to that personally.
- One solution that I was hoping our vendors would embrace would be for the platform to provide middleware automation that gives their users a smoother UX while still adhering to and preaching Knative concepts.
- Based on that, I wonder if instead of totally changing the spec to allow non-addressables, we can introduce a new concept of "addressable adapters" or "sink adapters", which means a source sink can be:
  - an addressable Ref
  - a URI
  - an addressable/sink adapter that refers in turn to a non-addressable resource, where vendors can provide their own adapter controllers while providing their users with a very similar UX as the one proposed:
```yaml
apiVersion: my.batch.job/v1alpha1
kind: JobDefinition
metadata:
  name: pi
spec:
  template:
    spec:
      containers:
        - name: pi
          image: perl
          command: ["perl", "-Mbignum=bpi", "-wle", "print bpi(2000)"]
      restartPolicy: Never
  backoffLimit: 4
---
apiVersion: sources.knative.dev/v1
kind: PingSource
metadata:
  name: trigger-job
spec:
  schedule: "@daily"
  sink:
    adapter:
      apiVersion: my.batch.job/v1alpha1
      kind: JobDefinition
      name: pi
```
- I can imagine there can be many ways to achieve the above, maybe a generic lower-level CRD such as `SinkAdapter` with class-based controllers similar to `Broker` or `Ingress`, and the default class being configurable cluster-wide?
- Such an approach IMHO would still preach the addressable sinks while providing users of our vendors with the needed smooth UX.

Again, as I said, I'm all for leveraging the experimental feature process to give this a shot. I'm just trying to surface some of the design goals we're trying to stick to and entertain other alternatives that provide the needed UX.
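For concreteness, a hypothetical `SinkAdapter` CR in this model might look like the sketch below. The group, fields, class name, and URL are invented for illustration: the spec carries the class and the non-addressable reference, and the class's controller writes the resolved URI into `status.address`, so the adapter itself satisfies the Addressable duck type.

```yaml
apiVersion: eventing.example.dev/v1alpha1   # hypothetical group/version
kind: SinkAdapter
metadata:
  name: pi-adapter
spec:
  class: job-runner                # selects the vendor's adapter controller
  ref:                             # the non-addressable resource to adapt
    apiVersion: my.batch.job/v1alpha1
    kind: JobDefinition
    name: pi
status:
  address:
    # written by the class's controller; illustrative URL
    url: https://jobrunner.system-ns.svc.cluster.local/default/pi
```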
I agree with @devguyio's sentiment. It feels like what you want is a resource in between, that can then implement whichever semantics you like (in your case calling some external service to fetch the URL the source should sink into) while also implementing the `Addressable` interface. Has that been considered?
I haven't yet read fully through the issue, but my first questions about the given use case:
- Why not use a Kubernetes `CronJob`? (And just out of curiosity, the `JobDefinition` resource looks very much like a standard K8s `Job`; is there any particular reason not to use a standard `Job`?)
- Besides the PingSource + custom Job use case (which I believe can be solved with standard K8s means), is there any other concrete use case? (@duglin mentioned that the given use case is only one of many.)

> other possible use cases include:
> - event -> buildDefinition (e.g. Shipwright build definition)
> - event -> workflowDefinition

This is where we should try to influence the respective community to implement the `Addressable` schema in order to play nicely with Knative (and without requiring a custom resolver in between, which always looks like a duct-tape approach).
- When you have control over your custom `JobDefinition` resource, what prevents you from implementing the `Addressable` interface for it?

Side note on `CronJob` vs. `PingSource`: are we sure that our PingSource is hardened enough for edge cases? Funnily enough, my very first Open Source project was https://metacpan.org/pod/Schedule::Cron, which I supported for more than ten years (not anymore), and there were tons of edge cases that people found and that needed to be fixed. Especially dealing with daylight saving switches is tricky and in some situations even non-deterministic (e.g. when turning back the hour from 3 to 2 and you have a job scheduled for 2:30, when do you fire the job? The "first" 2:30? The "second" 2:30? Both?).
Still trying to find the real use cases, because if the real use case is to use PingSource as a full-featured cron scheduler, then I would recommend checking whether the PingSource is suitable for production (like checking edge cases such as those in https://metacpan.org/release/ROLAND/Schedule-Cron-1.01/source/t/execution_time.t).
If using `CronJob`, there are potentially many more users and maintainers for fixing such issues.
Thanks guys for looking at this.
@devguyio @markusthoemmes: there is an intermediate data-plane object, but no corresponding CRD. See my comment above about double reffing.
@devguyio Agree with all your points except:
- The reference does not directly expose a webhook. The intermediate data-plane object does.
- The `SinkAdapter` CRD, since that's exactly what we are trying to avoid.
@rhuss looping in @cr22rc.
> there is an intermediate data-plane object, but no corresponding CRD

@lionelvillard ack. that's my understanding as well.

> The reference does not directly expose a webhook. The intermediate data-plane object does.

I also understood the same, and it's what I meant by "specify any arbitrary resource as a `Sink` or a `Subscriber` given that such resource exposes a webhook at an arbitrary URI".

> SinkAdapter CRD since that's exactly what we are trying to avoid

My understanding is that you are trying to avoid making the user use an adapter CRD, right? The example I gave above will do exactly that: the user doesn't have to know anything about a SinkAdapter CRD. The `platform` is the one that'll create that CR for the user under the hood. This is basically the resolver service you're asking for, but using the k8s API (the CR is the request to resolve, and the CR status is the response with the URI). There's no extra adapter data plane in this model.
@devguyio this is an interesting solution. I don't really like the idea of changing the Destination API; `ref` in my mind is just fine.
I really like @duglin's idea about the ConfigMap that allows people to define a mapping between "Kind" and "URL-Template". Easy to implement, no external dependencies, no extra network hop (one of Scott's concerns). It also addresses @evankanderson's question/concern about re-triggering the address resolution, which now becomes technically feasible.
I just realized this has an impact on DomainMapping /cc @julz
If we aren't as worried about an extra network hop, it also seems feasible to create a service which uses URLs containing the k8s `kind` + `name` exposed in the Destination along with a special "resolver" object `ref`, where the resolver contains both the naming oracle and a proxy (or define 3xx behavior for Eventing as something other than a retry).
Note that the key behavior we're depending on for Destination (the thing that references a K8s Service or Addressable) is that the `ref`ed object will have a Watch-able change when the URL for a resource changes. (Technically, we have two of these watchable changes, one of which is the Kubernetes DNS service, which provides a short-TTL DNS record, and one of which is the Addressable/Service abstraction.)
@evankanderson The current approach (kind -> URL mapping in a ConfigMap) preserves this key behavior:
- When the referenced object is changed, a new URL is potentially generated, since any object can be tracked (modulo RBAC; not implemented in #5599).
- When the ConfigMap containing the Kind -> URL mapping is changed, all objects containing `ref`s can be re-reconciled (not implemented in #5599). A minimal sketch of this re-trigger mechanism follows.
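As a rough sketch of that re-trigger mechanism, using plain client-go informers: watch the mapping ConfigMap and, on change, resync every object holding a `ref`. The ConfigMap name/namespace and the `resyncAll` hook are assumptions for illustration; the actual wiring would live in the shared controller machinery.

```go
// Sketch: re-trigger KReference resolution when the mapping ConfigMap changes.
package resolver

import (
	"time"

	corev1 "k8s.io/api/core/v1"
	"k8s.io/client-go/informers"
	"k8s.io/client-go/kubernetes"
	"k8s.io/client-go/tools/cache"
)

// watchMappings invokes resyncAll whenever the (hypothetical) kreference-mapping
// ConfigMap changes, so previously resolved URIs are recomputed against the new
// templates. The caller starts the returned informer and waits for cache sync.
func watchMappings(client kubernetes.Interface, resyncAll func()) cache.SharedIndexInformer {
	factory := informers.NewSharedInformerFactoryWithOptions(
		client, 10*time.Minute, informers.WithNamespace("knative-eventing"))
	inf := factory.Core().V1().ConfigMaps().Informer()
	inf.AddEventHandler(cache.ResourceEventHandlerFuncs{
		UpdateFunc: func(_, obj interface{}) {
			if cm, ok := obj.(*corev1.ConfigMap); ok && cm.Name == "kreference-mapping" {
				resyncAll() // e.g. enqueue every object whose spec holds a ref
			}
		},
	})
	return inf
}
```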
Just to speak it out loud (re @devguyio's suggestion): you could install a webhook that leaves the API intact and does the translation for you, so that

```yaml
ref:
  apiVersion: my.batch.job/v1alpha1
  kind: JobDefinition
  name: pi
```

becomes

```yaml
ref:
  apiVersion: adapters.com/v1alpha1
  kind: MySinkAdapter
  name: pi
```

automatically, and all interfaces can stay intact. That'd leave the interface to be purely K8s API based, which has some level of beauty to it, imo anyway.
@markusthoemmes that's an interesting solution. However, I don't like the idea of having the webhook change the spec; too confusing. It's fine for the webhook to populate defaults but nothing more than that, IMO. It also makes exporting applied resources to yaml non-trivial.

> all interfaces can stay intact.

That's nice to have but not a requirement.

> That'd leave the interface to be purely K8s API based,

k8s or Knative? `ref` is k8s, Addressable is Knative.
FWIW, the resolved sink appears in the status section so it's clear where events are sent.
I'm not quite parsing in which way that's more confusing vs. having the resolution done by some service that the user can't even inspect. In the outlined approach, the user would see exactly what's happening and can check the respective adapter to see what the actually resolved URL was (aka: where their stuff is sent).
This is similar behavior to what istio sidecar-injection is doing as well, so we're not out of bounds wrt. changing spec on apply. After all, the response the user gets from the apply would contain the mutations.

> having the resolution done by some service that the user can't even inspect.

We moved away from this solution. Sorry I didn't update the issue description (will do).

> the user would see exactly what's happening and can check the respective adapter to see what the actually resolved URL was (aka: where their stuff is sent).

This is still the case with the mapping solution (see above): the user sees exactly where events are sent by looking at the resolved URL in the status section. The adapter may or may not send events. In the example above, the adapter creates jobs.

> This is similar behavior to what istio sidecar-injection is doing as well, so we're not out of bounds wrt. changing spec on apply. After all, the response the user gets from the apply would contain the mutations.

Is this just adding stuff or also changing the original spec?
Anyhow, one of the goals here is to hide (for lack of a better word) the inner workings. The user does not care about the adapter; that's just plumbing.
Just to make sure I understand: in the JobDefinition example, is it the case that if we wrote status.address.url to the JobDefinition in a controller (i.e. made it implement Addressable), things would "just work" without needing this? Putting aside whether doing that is a good idea or not, are there examples where something like that (or indeed Evan's suggestion of an EventTriggeredJob with the same semantics as JobDefinition but that implements Addressable) wouldn't fulfill the use case?

> Just to make sure I understand: in the JobDefinition example, is it the case that if we wrote status.address.url to the JobDefinition in a controller (i.e. made it implement Addressable), things would "just work" without needing this?

Yes.

> Putting aside whether doing that is a good idea or not, are there examples where something like that (or indeed Evan's suggestion of an EventTriggeredJob with the same semantics as JobDefinition but that implements Addressable) wouldn't fulfill the use case?

Yes for the use case and no for the desired UX. We would like to hide (in the spec, not in the status) EventTriggeredJob.

> no for the desired UX

Just to push on this for a moment: in what way is the UX different in this approach? If we added a controller - or indeed a webhook - which placed status.address on JobDefinition (which I think we could do, and which could even be driven by a ConfigMap), wouldn't the UX be identical, other than - to @markusthoemmes's point - the user could predict which URL would be used by looking at the JobDefinition, without having to wait for the sink's status to be resolved?

> which placed status.address on JobDefinition (which I think we could do, ...

That's an interesting idea. It's not always possible though, since it's possible to define a closed schema.
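To illustrate the alternative being discussed: if a controller or webhook wrote the address into the JobDefinition's status, the resource would satisfy the Addressable duck type and the existing `ref` resolution would work unchanged (the URL below is purely illustrative, and this only works where the CRD's schema permits the extra status field):

```yaml
apiVersion: my.batch.job/v1alpha1
kind: JobDefinition
metadata:
  name: pi
status:
  address:
    # written by a hypothetical controller/webhook; not part of the original CRD
    url: https://jobrunner.system-ns.svc.cluster.local/default/pi
```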
This issue is stale because it has been open for 90 days with no activity. It will automatically close after 30 more days of inactivity. Reopen the issue with `/reopen`. Mark the issue as fresh by adding the comment `/remove-lifecycle stale`.
/triage-accepted
/remove-lifecycle stale
/triage accepted