triggers icon indicating copy to clipboard operation
triggers copied to clipboard

Support trigger via CloudEvents

Open afrittoli opened this issue 2 years ago • 10 comments

Feature request

Support CloudEvents compliant responses.

I see a few options:

  • When a cloudevent is received, respond with a cloudevent. The current body could be placed in the payload.
  • When a cloudevent is received, respond with no payload
  • Allow users to disable the response payload platform wise

Use case

When the project was started, it was decided to support generic HTTP requests with JSON payload rather than focus specifically on CloudEvents. The current behaviour of triggers is not compliant with the CloudEvents spec though, which makes it hard to use Triggers as a sink for CloudEvents.

When using triggers with Knative broker, the broker checks if the response is a cloud event, or empty, and reports a 502 back otherwise. See the code and the logs from a test cluster:

{"level":"debug","ts":"2022-09-01T10:59:16.299Z","logger":"mt_broker_filter","caller":"filter/filter_handler.go:173","msg":"Received message","commit":"56edb3b","triggerRef":"default/all-to-tekton-el-events"}
{"level":"debug","ts":"2022-09-01T10:59:16.299Z","logger":"mt_broker_filter","caller":"filter/filter_handler.go:353","msg":"Applying attributes filter.","commit":"56edb3b","trigger":"default/all-to-tekton-el-events","filter":{}}
{"level":"debug","ts":"2022-09-01T10:59:16.350Z","logger":"mt_broker_filter","caller":"filter/filter_handler.go:222","msg":"Successfully dispatched message","commit":"56edb3b","target":"http://el-events.default.svc.cluster.local:8080"}
{"level":"error","ts":"2022-09-01T10:59:16.350Z","logger":"mt_broker_filter","caller":"filter/filter_handler.go:227","msg":"failed to write response","commit":"56edb3b","error":"received a non-empty response not recognized as CloudEvent. The response MUST be either empty or a valid CloudEvent","stacktrace":"knative.dev/eventing/pkg/broker/filter.(*Handler).send\n\tknative.dev/eventing/pkg/broker/filter/filter_handler.go:227\nknative.dev/eventing/pkg/broker/filter.(*Handler).ServeHTTP\n\tknative.dev/eventing/pkg/broker/filter/filter_handler.go:209\ngo.opencensus.io/plugin/ochttp.(*Handler).ServeHTTP\n\[email protected]/plugin/ochttp/server.go:92\nknative.dev/pkg/network/handlers.(*Drainer).ServeHTTP\n\tknative.dev/[email protected]/network/handlers/drain.go:110\nnet/http.serverHandler.ServeHTTP\n\tnet/http/server.go:2879\nnet/http.(*conn).serve\n\tnet/http/server.go:1930"}

afrittoli avatar Sep 01 '22 12:09 afrittoli

We can probably add a cloud event response type behind a opt in flag?

dibyom avatar Sep 01 '22 18:09 dibyom

We can probably add a cloud event response type behind a opt in flag?

It sounds good, but I think it should be only for incoming CloudEvents.

afrittoli avatar Sep 01 '22 19:09 afrittoli

Are you saying we return the current response if the input is not a cloud event? I was thinking we could eventually stabilize on always returning a cloud event?

dibyom avatar Sep 01 '22 20:09 dibyom

Are you saying we return the current response if the input is not a cloud event? I was thinking we could eventually stabilize on always returning a cloud event?

I guess we could... I'm not sure if there is any spec about expected replies for a webhook. If we use CloudEvents binary format, the body would not change at all, only HTTP headers would be added... so I guess we could send that to all eventually.

afrittoli avatar Sep 01 '22 20:09 afrittoli

Yeah, there are no expected replies as far as I know (other than the HTTP status code)

dibyom avatar Sep 14 '22 16:09 dibyom

Just to add some context/motivation. This is especially an issue in combination with knatives retry function (https://knative.dev/docs/eventing/event-delivery/). Because of the unexpected response, it is treated as a failure and retried. The retry results in duplicated resources created by triggers.

Fabian-K avatar Sep 23 '22 09:09 Fabian-K

I ran into this too, and I had a look at the code...

https://github.com/tektoncd/triggers/blob/main/pkg/sink/sink.go#L211-L223

We could detect CloudEvent requests there, and either write the response as JSON, or, as a CloudEvent message

bigkevmcd avatar Oct 05 '22 08:10 bigkevmcd

I think the response is JSON today, but there are no CloudEvents headers, so it's rejected by the broker

afrittoli avatar Oct 05 '22 08:10 afrittoli

That's why we'd write it as a CloudEvent message (using the cloudevents code) which would set the response correctly.

bigkevmcd avatar Oct 05 '22 09:10 bigkevmcd

I had a crack at this earlier this morning https://github.com/tektoncd/triggers/compare/main...bigkevmcd:triggers:cloudevents-triggers it needs a bit of tidying but the test passes.

bigkevmcd avatar Oct 05 '22 10:10 bigkevmcd

I make extensive use of triggering pipelines via cloudevents delivered from knative... so that works... but I'm forced to disable retry since knative thinks all the responses are bogus regardless of the 202 since it responds with JSON that isn't a CE.

Doing any of the things in the original topic here works for me, even if guarded by a feature flag.

My old discussion on the subject: https://knative.slack.com/archives/C017X0PFC0P/p1652901710903979

treyhyde avatar Oct 27 '22 01:10 treyhyde

Issues go stale after 90d of inactivity. Mark the issue as fresh with /remove-lifecycle stale with a justification. Stale issues rot after an additional 30d of inactivity and eventually close. If this issue is safe to close now please do so with /close with a justification. If this issue should be exempted, mark the issue as frozen with /lifecycle frozen with a justification.

/lifecycle stale

Send feedback to tektoncd/plumbing.

tekton-robot avatar Jan 25 '23 02:01 tekton-robot

/remove-lifecycle stale

daper avatar Jan 25 '23 09:01 daper

Issues go stale after 90d of inactivity. Mark the issue as fresh with /remove-lifecycle stale with a justification. Stale issues rot after an additional 30d of inactivity and eventually close. If this issue is safe to close now please do so with /close with a justification. If this issue should be exempted, mark the issue as frozen with /lifecycle frozen with a justification.

/lifecycle stale

Send feedback to tektoncd/plumbing.

tekton-robot avatar Apr 25 '23 09:04 tekton-robot

/remove-lifecycle stale

daper avatar Apr 25 '23 10:04 daper

@dibyom i think this has been implemented, right?

afrittoli avatar Apr 25 '23 11:04 afrittoli

There is a implementation related to slack interceptor which also supports http form data as part of this https://github.com/tektoncd/triggers/pull/1548 which is released in Triggers v0.24

@afrittoli does that PR helps ?

savitaashture avatar May 30 '23 04:05 savitaashture

Thanks @savitaashture - that's a nice feature to have, but I don't think it would help with CloudEvents specifically. The problem with CloudEvents is that the specification mandates the reply to be compliant with the CloudEvents specification.

I believe that has been addressed and solved in https://github.com/tektoncd/triggers/pull/1469

afrittoli avatar May 30 '23 10:05 afrittoli

Yes, it has been resolved by #1469 by following 1st approach. I don't see a strong need for disabling the response.

khrm avatar May 30 '23 13:05 khrm