
Decide on the future of the Broker and Trigger

Open devguyio opened this issue 5 years ago • 21 comments

Background: After working with the Channel/Subscription model and gathering feedback from users, there was agreement that event consumers for the MVP use cases want to consume events based on CloudEvents attributes, without knowledge of how the events are routed or produced (i.e. which Channel they need to create a Subscription for, etc.). This led to the existing Broker and Trigger model (see #815, #814, #862).
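For reference, the model as it exists today: a consumer declares interest through a Trigger that filters on CloudEvents attributes, while the Broker hides the underlying routing. A minimal sketch (resource names and the event type are made up; the exact API version depends on the release you are running):

```yaml
apiVersion: eventing.knative.dev/v1
kind: Broker
metadata:
  name: default
---
apiVersion: eventing.knative.dev/v1
kind: Trigger
metadata:
  name: order-created-trigger
spec:
  broker: default
  filter:
    attributes:
      # exact-match on CloudEvents attributes; no knowledge of Channels required
      type: dev.example.order.created
  subscriber:
    ref:
      apiVersion: serving.knative.dev/v1
      kind: Service
      name: order-processor
```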

Problem: After the initial basic implementation of the Broker and Trigger and collecting feedback, it became clear that there are multiple shortcomings:

  • Broker requires unnecessary event deliveries #2288
  • Brokers persist messages unnecessarily #2287
  • Broker-to-broker event routing across namespaces #2050
  • Broker needs to be a black box and not expose its internal design (e.g. Channels)

This clearly shows that the Broker and Trigger need to evolve into a better model and/or design.

Exit Criteria: There is a clear set of improvements to the Broker and Trigger spec and/or implementation(s) that addresses the problems above and allows wide adoption.

devguyio avatar Jan 29 '20 02:01 devguyio

The absence of a definition of what a Broker is and does, and of how we evaluate whether something is a Broker or not, led to @grantr working on a Broker Conformance Spec.

Issue #2306

devguyio avatar Jan 29 '20 02:01 devguyio

@n3wscott started a discussion around alternate Broker implementations with a proposal

Issue #2274

devguyio avatar Jan 29 '20 03:01 devguyio

Can we discuss this issue in the v1beta1 task force on Wednesday?

mikehelmick avatar Jan 29 '20 04:01 mikehelmick

IMO another shortcoming is that the trigger/filter is incredibly limited and covers only a small number of use cases. That said, there is currently no good way to "plug in" a custom filtering implementation.

This makes the user fall back to the "traditional" (or stable) Channel/Subscription model...
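For context, that fallback looks roughly like this: the consumer wires a Subscription directly to a Channel and does any non-trivial filtering itself. A rough sketch with made-up names (API versions vary by release):

```yaml
apiVersion: messaging.knative.dev/v1
kind: InMemoryChannel
metadata:
  name: orders
---
apiVersion: messaging.knative.dev/v1
kind: Subscription
metadata:
  name: orders-subscription
spec:
  channel:
    apiVersion: messaging.knative.dev/v1
    kind: InMemoryChannel
    name: orders
  subscriber:
    ref:
      apiVersion: serving.knative.dev/v1
      kind: Service
      # the subscriber receives every event on the channel and must filter in code
      name: order-processor
```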

matzew avatar Jan 29 '20 11:01 matzew

We should record the goals for the Broker, such as that the in-memory version is not for production use and that optimized implementations may not use channels. Other changes?

aslom avatar Jan 30 '20 16:01 aslom

I've tried to compile an initial list of what would need to be in place in order to use the current Broker implementation in a production context for the project I am working on. Not all of these issues apply to everyone and not all are completely mandatory, but hopefully this helps start a discussion. I think they are general enough that they shouldn't dictate a separate Broker implementation, unless the default Broker is only intended for POCs and development (like the in-memory channel). Some of the points are repeated from above or already have related issues (with both partial and complete solutions proposed). I would have created this as a Google doc for easier collaboration, but I don't think I have access to create one in the knative community space there.

  • High throughput use cases
    • Ingress and filter must be horizontally scalable (#2461, KSVC would also be a good solution for us)
    • Ingress and filter must support resource requests and limit specification (#2321)
    • Ingress and filter must support additional annotations to configure aspects outside of knative eventing (e.g. istio annotations to define envoy CPU request overrides without changing full service mesh defaulting)
  • Efficiency
    • Current broker cannot consume any events without re-persisting them. (#2287)
      • E.g. events are in a kafka topic already, they must now be read by a source, then sent to the broker ingress, re-persisted in a different kafka topic, and finally read for each trigger. Note that this problem is the same for all source + channel approaches and is not unique for the broker.
    • Current broker must read all events for each trigger. High volume and low volume event streams mixed in a single broker or a large number of different event types in a single broker are problematic cases here. (#2288)
      • To keep the broker channel-based, event partitioning strategies using multiple channels could be introduced, possibly in a pluggable way. Some ways this could be used: a regex to support use cases where the semantics of event types are well known, a fixed count of types per channel to support generic use cases, and user-specified grouping when characteristics such as event volume or requirements such as critical near-real-time delivery for an event type are known. (Regex or user-specified grouping would meet our initial requirements.)
  • Security
    • Different consumers need to be authorized independently. Not all events in a broker can be made uniformly available to all subscribers. (#2277)
  • Functional
    • Cross namespace delivery
      • Users may wish to isolate subscription workloads by namespace without having to send all events to each of those namespaces. It should not be required to create a new broker and re-persist all events in that namespace in order to consume them there. Allowing a namespace in the object ref of a trigger would help here (see the sketch after this list). Authorization policies around trigger creation and namespace usage could be applied externally.
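To make the cross-namespace point concrete, here is a purely hypothetical sketch of what "allowing a namespace in the object ref of a trigger" could look like; the namespace field on the subscriber ref is the proposal, not something the Trigger API is guaranteed to accept:

```yaml
apiVersion: eventing.knative.dev/v1
kind: Trigger
metadata:
  name: audit-trigger
  namespace: events            # namespace where the Broker lives
spec:
  broker: default
  filter:
    attributes:
      type: dev.example.audit  # made-up event type
  subscriber:
    ref:
      apiVersion: serving.knative.dev/v1
      kind: Service
      name: audit-sink
      namespace: audit         # hypothetical: deliver into a different namespace
```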

davyodom avatar Jan 31 '20 17:01 davyodom

IMO another shortcoming is that the trigger/filter is incredibly limited and covers only a small number of use cases. That said, there is currently no good way to "plug in" a custom filtering implementation.

This makes the user fall back to the "traditional" (or stable) Channel/Subscription model...

#2275

akashrv avatar Feb 05 '20 21:02 akashrv

/assign

grantr avatar Feb 06 '20 01:02 grantr

The questions which seem to be floating around without clear consensus are:

  1. Should a default broker exist in eventing-core?
  2. Should this default broker have production-like qualities or should a different one be created in eventing-contrib?
  3. Should the prod-like broker be channel based and expand on the current impl in eventing?
  4. Is the full set of features/qualities in the comments above in scope?
  5. Should alternate broker implementations be supported? (I think this already has a 'yes' consensus, but just listing here in case there is still some debate)

davyodom avatar Feb 07 '20 16:02 davyodom

My 2 cents:

The questions which seem to be floating around without clear consensus are:

  1. Should a default broker exist in eventing-core?

[akashrv] Yes. Or it should at least be part of the default install (release yaml). A customer should not be forced into a vendor-written broker and should have a Knative OSS alternative.

  2. Should this default broker have production-like qualities or should a different one be created in eventing-contrib?
  3. Should the prod-like broker be channel based and expand on the current impl in eventing?

[akashrv] What a "production-like broker" means is debatable. For example, certain delivery semantics could be transport-specific, but we can always define some minimum spec for something to be called a Broker, and #2306 tries to define this. So, IMO, if we have a Broker that is backed by channels (not saying the current implementation should be the one) and channels abstract out delivery semantics and guarantees, then we can have a default broker and an operator can decide to leverage channels to get different delivery semantics.
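As a concrete illustration of "channels abstract out delivery semantics": with a channel-based broker, an operator picks the backing channel implementation through configuration rather than through the Broker API itself. A sketch using the default-channel ConfigMap (key names may differ between releases):

```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: config-br-default-channel
  namespace: knative-eventing
data:
  # swap InMemoryChannel for e.g. a Kafka-backed channel to change
  # delivery guarantees without touching Broker or Trigger objects
  channel-template-spec: |
    apiVersion: messaging.knative.dev/v1
    kind: InMemoryChannel
```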

  4. Is the full set of features/qualities in the comments above in scope?
  5. Should alternate broker implementations be supported? (I think this already has a 'yes' consensus, but just listing here in case there is still some debate)

[akashrv] Yes, different vendors should be able to write custom Brokers that are conformant with the Broker spec. This could be for a variety of reasons, such as leveraging the capabilities of vendor infrastructure, or because the current channel-based broker design doesn't work out.
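For illustration, alternate implementations are typically selected per Broker via the broker class annotation plus an implementation-specific config reference; a sketch (the annotation value and ConfigMap name depend on which implementation is installed):

```yaml
apiVersion: eventing.knative.dev/v1
kind: Broker
metadata:
  name: default
  annotations:
    # selects which Broker implementation reconciles this object
    eventing.knative.dev/broker.class: Kafka
spec:
  config:
    apiVersion: v1
    kind: ConfigMap
    name: kafka-broker-config      # implementation-specific configuration
    namespace: knative-eventing
```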

akashrv avatar Feb 07 '20 19:02 akashrv

Regarding Triggers, Tekton implemented CEL-based triggers using the Knative filtering proposal from @grantr as a reference. FYI @wlynch
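For anyone unfamiliar with the Tekton side, a trimmed, illustrative snippet of a CEL interceptor filter as it appeared in early Tekton Triggers releases (field names and the expression here are only indicative, not a definitive reference):

```yaml
# excerpt from an EventListener trigger definition
interceptors:
  - cel:
      # drop events unless they are GitHub pull_request events that were just opened
      filter: "header.match('X-GitHub-Event', 'pull_request') && body.action == 'opened'"
```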

TristonianJones avatar Feb 28 '20 00:02 TristonianJones

This issue is stale because it has been open for 90 days with no activity. It will automatically close after 30 more days of inactivity. Reopen the issue with /reopen. Mark the issue as fresh by adding the comment /remove-lifecycle stale.

github-actions[bot] avatar Nov 25 '20 01:11 github-actions[bot]

/reopen

aslom avatar Dec 06 '20 23:12 aslom

@devguyio is this still relevant? Can we close it? It seems to me all the questions are pretty much addressed.

slinkydeveloper avatar Feb 11 '21 14:02 slinkydeveloper

This issue is stale because it has been open for 90 days with no activity. It will automatically close after 30 more days of inactivity. Reopen the issue with /reopen. Mark the issue as fresh by adding the comment /remove-lifecycle stale.

github-actions[bot] avatar May 13 '21 01:05 github-actions[bot]

I've been working on some issues related to this one and I want to have this on our roadmap.

/reopen /assign

devguyio avatar Dec 10 '21 12:12 devguyio

@devguyio: Reopened this issue.

In response to this:

I've been working on some issues related to this one and I want to have this on our roadmap.

/reopen /assign

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

knative-prow-robot avatar Dec 10 '21 12:12 knative-prow-robot

/remove-lifecycle stale

devguyio avatar Dec 10 '21 12:12 devguyio

This issue is stale because it has been open for 90 days with no activity. It will automatically close after 30 more days of inactivity. Reopen the issue with /reopen. Mark the issue as fresh by adding the comment /remove-lifecycle stale.

github-actions[bot] avatar Mar 11 '22 01:03 github-actions[bot]

@pierDipi @lionelvillard FYI this got marked as stale

devguyio avatar Apr 10 '22 01:04 devguyio

/remove-lifecycle stale

lionelvillard avatar Apr 11 '22 14:04 lionelvillard

This issue is stale because it has been open for 90 days with no activity. It will automatically close after 30 more days of inactivity. Reopen the issue with /reopen. Mark the issue as fresh by adding the comment /remove-lifecycle stale.

github-actions[bot] avatar Aug 24 '22 01:08 github-actions[bot]