
Triggers are in an inconsistent state with the Channel

Open astelmashenko opened this issue 11 months ago • 4 comments

Describe the bug We are using Knative Eventing and Serving. Overall it works properly, but we hit an issue that I'm not sure how to resolve. It is related to MTBroker and Trigger. The channel implementation is eventing-natss. Consider the following setup:

  • Default broker
  • Around 20 triggers, [trigger1..triggerN]

We deleted several triggers (trigger1..3) while the webhook was unavailable (for some reason). Now I do not see those triggers when I list them with:

kubectl get trigger -n myns

Some triggers were deleted.

But I noticed errors in the broker-filters saying that trigger1..3 are not found. Then I checked the channel, which in my case is a NatsJetStreamChannel, and its Subscribers (visible in the YAML) still include the deleted Triggers. It seems that, because of the broken communication with the webhook (just an assumption), the triggers and the channel behind the broker are in an inconsistent state.

How can I make it consistent again? Should I somehow trigger a re-sync of the broker/channel with the triggers that actually exist? Or should I just edit the NatsJetStreamChannel resource and remove the non-existing subscribers?

Expected behavior MTBroker should re-sync channel with actual state.

To Reproduce Not sure how to do it.

Knative release version 1.16.x

astelmashenko avatar Mar 24 '25 15:03 astelmashenko

@astelmashenko I'd ask on the repo that provides the NATS channel, as they may have special logic for reconciling the triggers (e.g. the Kafka broker has code that runs the triggers for the broker, instead of relying on "core" Knative).

matzew avatar Mar 26 '25 15:03 matzew

eventing-natss implements only channels/subscriptions. Here I'm talking about MTBroker, which is part of the eventing project. The eventing-natss controller does not know anything about triggers, because triggers are part of MTBroker (the knative-eventing project). I was wondering how the MTBroker controller could re-sync its triggers with the underlying channel.

astelmashenko avatar Mar 26 '25 16:03 astelmashenko

Ah, sorry, I misunderstood.

Can you share the logs from the eventing-controller?

Is this easy to reproduce, or always reproducible? Also with the InMemoryChannel?

matzew avatar Mar 27 '25 17:03 matzew

I do not have logs. It happens from time to time, and I do not know the exact steps to reproduce it. We do not use InMemoryChannel. But when it happens, deleted subscriptions (triggers) are still present in the channel. NatsJetStreamChannel does not know anything about triggers, because triggers belong to MTBroker and its controller. So we just see error logs saying that the subscription does not exist when it tries to send to a broker-filter, like:

unable to complete request to http://broker-filter.knative-eventing.svc.cluster.local/triggers/customer1/core-crm-user-trigger/9574ca95-3d5d-4c4b-a9aa-dcb9f6922b02
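The trigger that such a failing request points at can be read off the URL itself. A minimal sketch, assuming the broker-filter path has the layout `/triggers/<namespace>/<name>/<uid>` (inferred from the URL above, not confirmed in this thread); `parse_trigger_path` is a hypothetical helper name:

```python
from urllib.parse import urlparse


def parse_trigger_path(url):
    """Split a broker-filter URL into (namespace, trigger name, trigger UID).

    Assumption: the path looks like /triggers/<namespace>/<name>/<uid>,
    as in the error message quoted above.
    """
    parts = urlparse(url).path.strip("/").split("/")
    if len(parts) != 4 or parts[0] != "triggers":
        raise ValueError(f"unexpected broker-filter path: {url}")
    _, namespace, name, uid = parts
    return namespace, name, uid
```

The returned namespace and name can then be compared against the output of `kubectl get trigger -n <namespace>` to confirm that the trigger no longer exists.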

What I think would help is re-sync functionality that somehow deletes orphaned subscribers.

astelmashenko avatar Mar 28 '25 07:03 astelmashenko

This issue is stale because it has been open for 90 days with no activity. It will automatically close after 30 more days of inactivity. Reopen the issue with /reopen. Mark the issue as fresh by adding the comment /remove-lifecycle stale.

github-actions[bot] avatar Jun 27 '25 01:06 github-actions[bot]

We are observing this issue again. If I get the channel resource:

k get natsjetstreamchannel default-kne-trigger -n myns -o yaml

and count the subs listed there, in my case there are 74. When I get the subs via kubectl to check the existing Subscription resources:

kubectl get sub -n myns

I get 44. So there are 30 orphaned subscribers. I'm not sure where to look into this: is it in the eventing-natss channel implementation or in the knative-eventing broker?
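Comparing counts only shows that something is off. To see which subscribers are orphaned, the two outputs can be diffed by UID. A rough sketch, assuming the channel lists its subscribers under `spec.subscribers` with a `uid` field that matches the `metadata.uid` of the owning Subscription (an assumption about the channel schema, not verified in this thread); `orphaned_subscribers` and `load_json` are hypothetical helper names:

```python
import json


def load_json(path):
    """Load a file saved from `kubectl get ... -o json`."""
    with open(path) as f:
        return json.load(f)


def orphaned_subscribers(channel, subscriptions):
    """Return channel subscriber entries with no matching Subscription.

    channel:       parsed JSON of a single channel resource
    subscriptions: parsed JSON of a Subscription list (with an "items" array)
    Assumption: spec.subscribers[].uid equals the Subscription's metadata.uid.
    """
    live_uids = {s["metadata"]["uid"] for s in subscriptions.get("items", [])}
    subscribers = channel.get("spec", {}).get("subscribers") or []
    return [s for s in subscribers if s.get("uid") not in live_uids]
```

To use it, save both resources with `-o json` instead of `-o yaml` (for example to `channel.json` and `subs.json`), then call `orphaned_subscribers(load_json("channel.json"), load_json("subs.json"))`. Unlike the 74 minus 44 count, the UID diff stays correct when the namespace also contains Subscriptions that belong to other channels.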

astelmashenko avatar Jul 24 '25 14:07 astelmashenko

This issue is stale because it has been open for 90 days with no activity. It will automatically close after 30 more days of inactivity. Reopen the issue with /reopen. Mark the issue as fresh by adding the comment /remove-lifecycle stale.

github-actions[bot] avatar Oct 24 '25 01:10 github-actions[bot]