Notification system
Author: Loren
Summary of the feature being proposed
Add two capabilities:
- Allow subscriptions to raw Event History events via gRPC streaming
- Provide a UI for configuring notification triggers (and the backend that stores and executes them):
  - Flexible condition builder
  - Flexible trigger action (webhook, email, SMS, Slack)
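To make the second capability more concrete, here is a minimal sketch of how a stored trigger might pair a set of conditions with an action. Everything in it is hypothetical and for illustration only: the `Condition` and `Trigger` classes, the event dictionary fields, the event type strings, and the `action` callable are assumptions, not an existing Temporal API.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, List

# Hypothetical event shape, e.g. {"event_type": ..., "workflow_id": ...}
Event = Dict[str, Any]


@dataclass
class Condition:
    """One clause a condition builder could produce: field <op> value."""
    field: str
    op: str  # "eq", "ne", or "in"
    value: Any

    def matches(self, event: Event) -> bool:
        actual = event.get(self.field)
        if self.op == "eq":
            return actual == self.value
        if self.op == "ne":
            return actual != self.value
        if self.op == "in":
            return actual in self.value
        raise ValueError(f"unsupported operator: {self.op}")


@dataclass
class Trigger:
    """Runs `action` when every condition matches (logical AND)."""
    name: str
    conditions: List[Condition]
    # Stand-in for webhook, email, SMS, or Slack delivery.
    action: Callable[[Event], None]

    def handle(self, event: Event) -> bool:
        if all(c.matches(event) for c in self.conditions):
            self.action(event)
            return True
        return False


# Usage: collect matching events in a list instead of calling a real webhook.
delivered: List[Event] = []
timeouts = Trigger(
    name="workflow-timeouts",
    conditions=[Condition("event_type", "eq", "WorkflowExecutionTimedOut")],
    action=delivered.append,
)
timeouts.handle({"event_type": "WorkflowExecutionTimedOut", "workflow_id": "wf-1"})
```

Keeping the action as a plain callable means the delivery channel could be swapped without touching the condition logic, which may be one way to start with a webhook only and add other channels later.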
What value does this feature bring to Temporal?
Users want to be able to receive a push notification when something happens. For example, when a Workflow times out, they want to receive that event with information about the Workflow, like its ID and history.
The current methods of getting this are:
- A count metric (not helpful for knowing which workflow timed out):
  - Open-source server metric: https://github.com/temporalio/temporal/blob/5ab8ab804fa3e0998f419260b4424154419428d9/common/metrics/defs.go#L2763
  - Cloud metric: temporal_cloud_v0_workflow_timeout_count
- Polling ListWorkflowExecutions for Executions in a Timed Out state (note: polling may take more steps and be more resource-intensive to do for other types of events)
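For comparison, the polling approach might look roughly like the sketch below. It assumes a hypothetical `list_executions` wrapper around ListWorkflowExecutions that accepts a visibility query string and yields dictionaries with `workflow_id` and `run_id` keys. The query string follows the list-filter syntax as I understand it and should be checked against your server version.

```python
from typing import Callable, Iterable, List, Set, Tuple

# Visibility list filter for timed-out executions (verify against your server version).
TIMED_OUT_QUERY = 'ExecutionStatus = "TimedOut"'


def poll_timed_out(
    list_executions: Callable[[str], Iterable[dict]],
    seen: Set[Tuple[str, str]],
) -> List[dict]:
    """Run one polling pass and return executions not reported before.

    `list_executions` is a hypothetical wrapper around ListWorkflowExecutions.
    `seen` holds (workflow_id, run_id) pairs from earlier passes and is
    updated in place.
    """
    fresh = []
    for execution in list_executions(TIMED_OUT_QUERY):
        key = (execution["workflow_id"], execution["run_id"])
        if key not in seen:
            seen.add(key)
            fresh.append(execution)
    return fresh
```

The caller has to persist `seen` between passes and rerun the full query each time, which is part of why polling becomes costly as the number of event types grows.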
Are you willing to implement this feature yourself?
Yes but I'd be pretty slow at the Go part 😄
This is much more complicated than it sounds and we should discuss this at a broader level. gRPC streaming is a bit hard to proxy properly in a truly HA environment. Also, lossless stream resumption across workflows would be fraught with state complication. Also, maintaining sinks to user endpoints that you push to from anywhere in the cluster when an event happens is rough from a state management POV. Also, there is no need for the UI or condition builder or flexible action or any of those other features for some kind of MVP.
Many users have a larger use case than what you are asking here: They want every Temporal event from every workflow or activity or even cloud events (but some of their use cases can be solved by export). It'd make little sense to have some single-workflow callback and ignore the ability to provide all Temporal events.
We need a general firehose that anyone can read from, but again there are many complications involved. It should be a full project and not narrowly tailored. I have many more advanced implementation ideas for this and would be willing to submit a real proposal if requested. There has also been some discussion of this in the past, so those points would need to be included.
Yes but I'd be pretty slow at the Go part
That's like the entire part :-)
👍 I meant the first as the firehose (with which someone could implement the second externally). Makes sense to include activity & cloud events as well. Product-wise, I think an MVP could either be the firehose or a small set of conditions with a webhook action.
We need a general firehose that anyone can read from... I would be willing to submit a real proposal if requested.
A firehose of events that spans all Temporal workflows/activities would be very useful to us. If it's helpful motivation for writing the proposal, here are two use-cases that come to mind:
- Alerting on failed workflows. Currently, we run a cron job that loops across all namespaces with the CLI and looks for failed workflows. A firehose would give us a more efficient way to listen to events across all namespaces and would allow us to alert closer to real time.
- Synchronizing Temporal state to other systems. We currently use interceptors to detect in real-time when a workflow/activity makes progress and then asynchronously push this out. This has complications as the state of the external systems can end up out-of-sync with Temporal.
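As a rough illustration of the alerting use case, a firehose consumer could look something like the sketch below. No such firehose exists today, so the whole interface is assumed: `read_from` is a hypothetical function that yields `(offset, event)` pairs after a given cursor, and the event type strings and fields are illustrative.

```python
from typing import Callable, Iterable, Optional, Tuple

# Illustrative event type names that should raise an alert.
FAILURE_TYPES = {"WorkflowExecutionFailed", "WorkflowExecutionTimedOut"}


def alert_on_failures(
    read_from: Callable[[Optional[int]], Iterable[Tuple[int, dict]]],
    alert: Callable[[dict], None],
    cursor: Optional[int] = None,
) -> Optional[int]:
    """Consume events after `cursor`, alert on failures, return the new cursor.

    `read_from` is a hypothetical firehose reader; passing None reads from
    the start of the stream.
    """
    for offset, event in read_from(cursor):
        if event.get("event_type") in FAILURE_TYPES:
            alert(event)
        cursor = offset  # advance only after the event has been handled
    return cursor
```

Returning the cursor lets the caller store it and resume after a restart. If the process dies between `alert` and saving the cursor, the event is redelivered, so this gives at-least-once behavior and the alert handler would need to tolerate duplicates.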
Another use case: https://docs.prefect.io/ui/automations/
Could also add Cloud-specific events like namespace creation:
Expose the ability to handle lifecycle events from our cloud. For example, a company wants to run their own workflow when a user provisions a new namespace using the Temporal Cloud UI or tcld. Without this feature, most companies would need to wrap namespace creation in their own UI/CLI.