
Notification Source Idempotency

Open mabdh opened this issue 3 years ago • 0 comments

Problem

Siren, as a notification service, requires idempotency for its Notification Sources to avoid duplicate notifications, which would lead to alert/notification fatigue for its users.

Assuming the network is always unreliable, retries are always needed to make sure data is delivered or a function is invoked. When sending notifications, a retry can produce a duplicate notification if it is not handled properly. Duplicate notifications are unnecessary, and too many alerts/notifications degrade the user experience and can lead to alert fatigue.

Solution

[Figure: oip3-2, flow of notification in Siren]

Above is the flow of notification in Siren. A Notification Source should be idempotent to avoid duplicate notifications. A notification has two possible sources: the manually triggered API and the provider alert webhook. Both sources need to transform their model into a Notification model. Idempotency can be handled at this point, before the notification is dispatched.

Idempotency Key

From the diagram above, the only point of failure we consider for idempotency when sending notifications is the Notification Dispatcher step. Once the notification message is in the queue, the transaction can be considered finished. We could define a new variable, Idempotency-Key, that identifies a single unique notification.

Idempotency-Key is a string of at most 100 characters with no whitespace. There is no restriction on the format, but UUID is preferred. The key is unique per user: two notifications may carry the same idempotency key and still be distinguished by user. However, since Siren currently has no knowledge of users (no auth), the user value can be set to API for notifications triggered via the API and to Cortex for notifications triggered by the Cortex webhook.
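The constraints above can be sketched as a small validation helper. This is only an illustration under stated assumptions: the function names are hypothetical, and rejecting an empty key is an assumption that the proposal does not state explicitly.

```python
import re
import uuid

MAX_KEY_LENGTH = 100  # limit stated in the proposal


def is_valid_idempotency_key(key: str) -> bool:
    """Check the proposed constraints: at most 100 chars, no whitespace.

    Rejecting the empty string is an assumption of this sketch.
    """
    if not key or len(key) > MAX_KEY_LENGTH:
        return False
    # Any whitespace character (space, tab, newline, ...) makes the key invalid.
    return re.search(r"\s", key) is None


def new_idempotency_key() -> str:
    """Generate a key in the preferred UUID format."""
    return str(uuid.uuid4())
```

A UUID string is 36 characters long and contains no whitespace, so a generated key always passes the check.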

Table notification_idempotency_keys

So that the idempotency check works across horizontally scaled Siren instances, idempotency key information needs to be stored in a single place. Therefore, a new notification_idempotency_keys table is needed in Siren's PostgreSQL database.

Field name | Field type | Properties
id | BIGSERIAL | PRIMARY KEY
user | TEXT | NOT NULL
idempotency_key | TEXT | NOT NULL CHECK (char_length(idempotency_key) <= 100)
created_at | TIMESTAMPTZ | NOT NULL DEFAULT now()

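To make sure the idempotency key is unique per user, we can create a UNIQUE INDEX. Note that user is a reserved word in PostgreSQL, so the column name has to be double-quoted (or renamed).

CREATE UNIQUE INDEX notification_idempotency_keys_user_idempotency_key ON notification_idempotency_keys ("user", idempotency_key);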
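A minimal sketch of how the unique index could drive the duplicate check: try to insert the (user, key) pair and treat a uniqueness violation as "already seen". It uses Python's built-in sqlite3 with an in-memory database purely as a stand-in for Siren's PostgreSQL; the helper names open_store and claim_key are hypothetical, and the length CHECK is left out on the assumption that keys are validated before this step.

```python
import sqlite3


def open_store() -> sqlite3.Connection:
    """Create an in-memory stand-in for the proposed table and unique index."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        """
        CREATE TABLE notification_idempotency_keys (
            id INTEGER PRIMARY KEY AUTOINCREMENT,
            "user" TEXT NOT NULL,
            idempotency_key TEXT NOT NULL,
            created_at TEXT NOT NULL DEFAULT CURRENT_TIMESTAMP
        )
        """
    )
    conn.execute(
        """
        CREATE UNIQUE INDEX notification_idempotency_keys_user_idempotency_key
        ON notification_idempotency_keys ("user", idempotency_key)
        """
    )
    return conn


def claim_key(conn: sqlite3.Connection, user: str, key: str) -> bool:
    """Return True if this (user, key) pair is new, False if it is a duplicate."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                'INSERT INTO notification_idempotency_keys ("user", idempotency_key) '
                "VALUES (?, ?)",
                (user, key),
            )
    except sqlite3.IntegrityError:
        # The unique index rejected the row: the key was already claimed.
        return False
    return True
```

A dispatcher would call claim_key before enqueuing and skip the notification when it returns False. In PostgreSQL the same idea could be expressed with INSERT ... ON CONFLICT DO NOTHING and a check of the affected row count.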

Idempotency Keys Time-To-Live (TTL)

Duplicate notifications tend to be sent within a relatively short period (a notification is unlikely to be retried or resent after a long time), so idempotency keys do not need to be stored forever. A predefined global time-to-live is configured for all idempotency keys, e.g. 24 hours. After that time has passed, an idempotency key can be reused.

Since PostgreSQL does not have a row-level TTL feature, a job is needed to delete expired keys regularly. Although this looks like a problem that could be solved by Redis or another storage that supports TTL, we could start with what we currently have and optimize later.
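The cleanup job could look roughly like the sketch below: delete every row whose created_at is older than the configured TTL. Again sqlite3 stands in for PostgreSQL, created_at is stored as epoch seconds for simplicity (the proposal uses TIMESTAMPTZ), and the names create_table and purge_expired_keys are hypothetical.

```python
import sqlite3
import time
from typing import Optional

TTL_SECONDS = 24 * 60 * 60  # the 24-hour example value from the proposal


def create_table(conn: sqlite3.Connection) -> None:
    """Simplified stand-in table; created_at holds epoch seconds."""
    conn.execute(
        """
        CREATE TABLE notification_idempotency_keys (
            id INTEGER PRIMARY KEY AUTOINCREMENT,
            "user" TEXT NOT NULL,
            idempotency_key TEXT NOT NULL,
            created_at REAL NOT NULL
        )
        """
    )


def purge_expired_keys(
    conn: sqlite3.Connection,
    ttl_seconds: float = TTL_SECONDS,
    now: Optional[float] = None,
) -> int:
    """Delete keys older than the TTL and return how many rows were removed."""
    if now is None:
        now = time.time()
    cutoff = now - ttl_seconds
    with conn:  # run the delete in a single committed transaction
        cur = conn.execute(
            "DELETE FROM notification_idempotency_keys WHERE created_at < ?",
            (cutoff,),
        )
    return cur.rowcount
```

Such a function would be run on a schedule (for example by a cron-style worker); the interval only affects how long expired keys linger, not correctness of the uniqueness check for live keys.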

Notify API Idempotency

The Notify API will be used to trigger notifications manually. We could support idempotency by accepting an Idempotency-Key header with a string value; users generate the idempotency key themselves (UUID preferred).
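From the caller's side, the important detail is that the key is generated once per logical notification and reused on every retry. The sketch below illustrates that with a hypothetical notify_with_retry helper; the send callable is an assumption standing in for whatever HTTP client actually calls the Notify API, and treating ConnectionError as the retryable failure is also an assumption.

```python
import uuid
from typing import Callable, Dict


def notify_with_retry(
    send: Callable[[Dict[str, str], dict], bool],
    payload: dict,
    max_attempts: int = 3,
) -> bool:
    """Send a notification, reusing one Idempotency-Key across all attempts."""
    # Generated once, outside the retry loop, so retries are recognizable
    # as duplicates of the same notification.
    headers = {"Idempotency-Key": str(uuid.uuid4())}
    for _ in range(max_attempts):
        try:
            if send(headers, payload):
                return True
        except ConnectionError:
            # Network failure: retry with the same key.
            continue
    return False
```

If the key were generated inside the loop, each retry would look like a new notification and the server-side check could not suppress the duplicate.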

mabdh avatar Sep 23 '22 09:09 mabdh