proposal: pubsub api add `delay` message support

Open kevinten10 opened this issue 2 years ago • 14 comments

Hi, here is some research on delay messages and a preliminary design, to help with the design of the related API later.

What would you like to be added:

delay message in pubsub api.

Why is this needed:

https://github.com/mosn/layotto/discussions/612

Support situation:

PubSub Service      Max Delay
AWS SQS             15min
AWS SNS             ×
Active MQ           48H
Rabbit MQ           24H
Kafka               ×
QMQ                 Year

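From the perspective of real business scenarios:

  • Millisecond-level delays are of little value; they can be handled in memory.
  • Minute- and hour-level delays are probably the most common.
  • Day-level delays are probably common too, since user behavior may be measured in days.
  • Delays of a month or more are of little value.

In summary, supporting up to 7 days is probably an appropriate range.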

API spec:

use metadata, add key like DELAY_IN_SECONDS
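
As a rough illustration of the metadata approach, the Go sketch below attaches the delay on the publishing side and reads it back on the component side. `PublishRequest`, `WithDelay`, `DelayOf` and the `MaxDelay` ceiling are hypothetical names made up for this example (they are not Layotto or dapr APIs); only the `DELAY_IN_SECONDS` key and the 7-day figure come from the proposal.

```go
package main

import (
	"errors"
	"fmt"
	"strconv"
	"time"
)

// DelayKey is the metadata key suggested in this proposal.
const DelayKey = "DELAY_IN_SECONDS"

// MaxDelay is the 7-day ceiling discussed above (an assumption, not a Layotto constant).
const MaxDelay = 7 * 24 * time.Hour

// PublishRequest is a hypothetical stand-in for a pubsub publish request.
type PublishRequest struct {
	Topic    string
	Data     []byte
	Metadata map[string]string
}

// WithDelay returns a copy of req whose metadata carries the delay in whole seconds.
// It does not validate d; validation happens in DelayOf.
func WithDelay(req PublishRequest, d time.Duration) PublishRequest {
	md := make(map[string]string, len(req.Metadata)+1)
	for k, v := range req.Metadata {
		md[k] = v
	}
	md[DelayKey] = strconv.FormatInt(int64(d/time.Second), 10)
	req.Metadata = md
	return req
}

// DelayOf reads the delay from metadata. A missing key means "deliver immediately".
func DelayOf(md map[string]string) (time.Duration, error) {
	raw, ok := md[DelayKey]
	if !ok {
		return 0, nil
	}
	secs, err := strconv.ParseInt(raw, 10, 64)
	if err != nil {
		return 0, fmt.Errorf("invalid %s %q: %w", DelayKey, raw, err)
	}
	if secs < 0 {
		return 0, errors.New(DelayKey + " must not be negative")
	}
	if secs > int64(MaxDelay/time.Second) {
		return 0, fmt.Errorf("%s exceeds the %v ceiling", DelayKey, MaxDelay)
	}
	return time.Duration(secs) * time.Second, nil
}

func main() {
	req := WithDelay(PublishRequest{Topic: "orders", Data: []byte("hello")}, 30*time.Minute)
	d, err := DelayOf(req.Metadata)
	fmt.Println(req.Metadata[DelayKey], d, err)
}
```

Keeping the value as a plain string in metadata means the publish API itself does not change; components that do not know the key can ignore it or reject it.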

What to do if it is not supported:

If the underlying PubSub component does not support delay messages, or its maximum delay is too short, I can initially think of a few options:

1. reject or throw exception.

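2. External system

When the DELAY_IN_SECONDS key is detected in the metadata, the message information {body, delaySeconds, rawTopic} can be sent to a special topic, similar to a dead-letter queue, or passed along through a network API call.

The consumer could then be a separate companion service that users install into the cluster themselves, for example: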

namespace            pod                                                             
layotto-system      layotto-pubsub-delay-operator

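This service listens to the special topic or accepts the API calls, holds each message for the requested delay, and delivers it to the real topic once the time is up.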
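
A minimal Go sketch of what such a companion service might do internally is shown below. It keeps envelopes in an in-memory min-heap ordered by due time and forwards the ones that are due. `Envelope`, `Relay`, `Accept` and `Flush` are hypothetical names for this example; a real service would also need durable storage, since everything here is lost on restart.

```go
package main

import (
	"container/heap"
	"fmt"
	"time"
)

// Envelope mirrors the {body, delaySeconds, rawTopic} information from the proposal,
// with the delay already converted to an absolute due time (Unix seconds).
type Envelope struct {
	Body     []byte
	RawTopic string
	DueUnix  int64
}

// envelopeHeap is a min-heap ordered by DueUnix.
type envelopeHeap []Envelope

func (h envelopeHeap) Len() int           { return len(h) }
func (h envelopeHeap) Less(i, j int) bool { return h[i].DueUnix < h[j].DueUnix }
func (h envelopeHeap) Swap(i, j int)      { h[i], h[j] = h[j], h[i] }
func (h *envelopeHeap) Push(x any)        { *h = append(*h, x.(Envelope)) }
func (h *envelopeHeap) Pop() any {
	old := *h
	n := len(old)
	e := old[n-1]
	*h = old[:n-1]
	return e
}

// Relay holds pending envelopes in memory only (an assumption made for brevity).
type Relay struct {
	pending envelopeHeap
}

// Accept records a message received from the special topic at time nowUnix.
func (r *Relay) Accept(body []byte, rawTopic string, delaySeconds, nowUnix int64) {
	heap.Push(&r.pending, Envelope{Body: body, RawTopic: rawTopic, DueUnix: nowUnix + delaySeconds})
}

// Flush publishes every envelope due at or before nowUnix, earliest first,
// and returns how many were sent. A failed envelope is kept for the next call.
func (r *Relay) Flush(nowUnix int64, publish func(topic string, body []byte) error) (int, error) {
	sent := 0
	for r.pending.Len() > 0 && r.pending[0].DueUnix <= nowUnix {
		e := heap.Pop(&r.pending).(Envelope)
		if err := publish(e.RawTopic, e.Body); err != nil {
			heap.Push(&r.pending, e)
			return sent, err
		}
		sent++
	}
	return sent, nil
}

func main() {
	r := &Relay{}
	now := time.Now().Unix()
	r.Accept([]byte("hello"), "orders", 60, now)
	n, err := r.Flush(now+60, func(topic string, body []byte) error {
		fmt.Printf("deliver %q to %s\n", body, topic)
		return nil
	})
	fmt.Println(n, err)
}
```

In a running service, `Flush` would be called from a ticker loop, and `publish` would wrap the real pubsub client.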

How to delay.

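As for how to implement the delay itself: done thoroughly, it is no less work than building a large distributed system, and the implementations in open-source MQs are a useful reference. Done simply, with SQS for example, you can delay for 15 min and then re-delay in a loop until the target time is reached.

If the component does not support delay at all, then either return an error or consider introducing an additional dependency.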
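
The "delay, then re-delay in a loop" idea can be sketched as a function that splits the total delay into hops no longer than the component's maximum. `PlanHops` is a hypothetical helper for this example; in practice the consumer would re-publish with the next hop each time the message comes back, rather than compute the whole plan up front.

```go
package main

import (
	"errors"
	"fmt"
)

// PlanHops splits totalSeconds into consecutive delays, each at most maxHopSeconds.
// With SQS, maxHopSeconds would be 900 (the 15 min limit from the table above).
func PlanHops(totalSeconds, maxHopSeconds int64) ([]int64, error) {
	if maxHopSeconds <= 0 {
		return nil, errors.New("maxHopSeconds must be positive")
	}
	if totalSeconds < 0 {
		return nil, errors.New("totalSeconds must not be negative")
	}
	var hops []int64
	for left := totalSeconds; left > 0; {
		hop := left
		if hop > maxHopSeconds {
			hop = maxHopSeconds
		}
		hops = append(hops, hop)
		left -= hop
	}
	return hops, nil
}

func main() {
	// A 40 minute delay on a component that allows at most 15 minutes per hop.
	hops, err := PlanHops(40*60, 15*60)
	fmt.Println(hops, err)
}
```

Each hop costs one extra publish and consume, so a long delay on a component with a short maximum multiplies the message traffic accordingly.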


PS: The above is only my own research and thinking. The API design is fairly simple; the hard part is how to support the feature.

kevinten10 avatar May 28 '22 16:05 kevinten10

Cool !

@azhsmesos Hi, will you continue working on delay queue API? This proposal is a good starting point

seeflood avatar May 29 '22 07:05 seeflood

Sure. How about I open an issue, have it assigned to me, and then write up a proposal?

azhsmesos avatar Jun 06 '22 03:06 azhsmesos

@azhsmesos No need to write a new proposal; we can discuss based on this one. @kevinten10 has already put this design into production.

seeflood avatar Jun 08 '22 06:06 seeflood

  • On the API change: agreed 🙆🏻‍♀️

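  • On "what to do if the underlying PubSub component does not support delay messages or its maximum delay is too short": this involves a "feature discovery" (or "feature negotiation") mechanism. I see that dapr is working on a capability API: at runtime the app calls the sidecar to check whether it can meet the requirement, and reports an error if it cannot. Personally, I would rather do feature negotiation on the operations side, for example by having a k8s operator detect that "the underlying PubSub component does not support delay messages or its maximum delay is too short" and refuse to deploy. For example, the pubsub component could be configured like this in the config file:

                      "pub_subs": {
                        "pub_subs_demo": {
                          "delayMessage": {
                            "maxDelayInSeconds": "86400"
                          },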
                          "type": "redis",
                          "metadata": {
                            "redisHost": "localhost:6380",
                            "redisPassword": ""
                          }
                        }
                      },

Later, this part of the configuration could be split out into a CRD, so that "feature negotiation" is done through k8s.

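  • On "how to emulate a delay queue in a cloud environment that has none": this is a bit troublesome, since it amounts to adding a new feature to the MQ. Can we leave it out of phase one?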
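
The deploy-time check suggested in the second bullet could be sketched roughly as follows: parse the component config and fail if the declared `maxDelayInSeconds` cannot cover what the application needs. `CheckDelaySupport` and its types are hypothetical names for this example, and the `delayMessage` block is only the shape proposed in this comment, not an existing Layotto config field.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strconv"
)

// delayMessageConfig matches the proposed "delayMessage" block (value is a string, as in the example).
type delayMessageConfig struct {
	MaxDelayInSeconds string `json:"maxDelayInSeconds"`
}

// componentConfig covers only the fields this check needs.
type componentConfig struct {
	Type         string              `json:"type"`
	DelayMessage *delayMessageConfig `json:"delayMessage"`
}

// CheckDelaySupport returns an error when the named pubsub component
// cannot satisfy a delay of requiredSeconds according to its config.
func CheckDelaySupport(raw []byte, name string, requiredSeconds int64) error {
	var cfg struct {
		PubSubs map[string]componentConfig `json:"pub_subs"`
	}
	if err := json.Unmarshal(raw, &cfg); err != nil {
		return fmt.Errorf("parse config: %w", err)
	}
	c, ok := cfg.PubSubs[name]
	if !ok {
		return fmt.Errorf("pubsub component %q not found", name)
	}
	if c.DelayMessage == nil {
		return fmt.Errorf("component %q (%s) does not declare delay message support", name, c.Type)
	}
	maxSecs, err := strconv.ParseInt(c.DelayMessage.MaxDelayInSeconds, 10, 64)
	if err != nil {
		return fmt.Errorf("component %q: invalid maxDelayInSeconds: %w", name, err)
	}
	if requiredSeconds > maxSecs {
		return fmt.Errorf("component %q allows at most %ds delay, but %ds is required", name, maxSecs, requiredSeconds)
	}
	return nil
}

func main() {
	raw := []byte(`{
	  "pub_subs": {
	    "pub_subs_demo": {
	      "delayMessage": {"maxDelayInSeconds": "86400"},
	      "type": "redis",
	      "metadata": {"redisHost": "localhost:6380", "redisPassword": ""}
	    }
	  }
	}`)
	// Requiring 7 days against a 1 day maximum should be reported as an error.
	fmt.Println(CheckDelaySupport(raw, "pub_subs_demo", 7*24*3600))
}
```

An operator or admission step could run a check like this before rollout, so an unsatisfiable delay requirement blocks deployment instead of failing at runtime.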

seeflood avatar Jun 08 '22 08:06 seeflood

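"On how to emulate a delay queue in a cloud environment that has none: this is a bit troublesome, since it amounts to adding a new feature to the MQ. Can we leave it out of phase one?"

Hi, I agree. We built it because we have this requirement on our side; the community can skip it in phase one and perhaps revisit it later.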

kevinten10 avatar Jun 08 '22 09:06 kevinten10

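Community meeting outcome: the feature negotiation mechanism can come later. For now, we will use documentation to tell users whether a given component supports delay messages.

However, there is one awkward part of implementing this feature: the pubsub components we use today come from dapr, so adding delay message support would mean forking the dapr components library, which is a hassle. Perhaps we could talk to the dapr community first and see whether this could be done in dapr? @kevinten10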

seeflood avatar Jun 11 '22 14:06 seeflood

OK. Should I open a proposal in dapr to discuss it?

kevinten10 avatar Jun 20 '22 03:06 kevinten10

action

  • [ ] Submit the proposal @kevinten10
  • [ ] Bump the thread @seeflood

seeflood avatar Jul 02 '22 12:07 seeflood

This issue has been automatically marked as stale because it has not had recent activity in the last 30 days. It will be closed in the next 7 days unless it is tagged (pinned, good first issue or help wanted) or other activity occurs. Thank you for your contributions.

github-actions[bot] avatar Aug 02 '22 03:08 github-actions[bot]

I've been busy lately. If nobody from the community has raised it yet, I will submit the proposal to dapr this week.

kevinten10 avatar Aug 02 '22 13:08 kevinten10

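@kevinten10 This is not high priority (production users are already using it by adding a field to metadata). If you have time, how about first following up on what we discussed earlier at https://github.com/mosn/layotto/issues/713#issuecomment-1189931691? Getting it landed for you takes priority :)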

seeflood avatar Aug 02 '22 23:08 seeflood

This issue has been automatically marked as stale because it has not had recent activity in the last 30 days. It will be closed in the next 7 days unless it is tagged (pinned, good first issue or help wanted) or other activity occurs. Thank you for your contributions.

github-actions[bot] avatar Sep 02 '22 03:09 github-actions[bot]

This issue has been automatically closed because it has not had activity in the last 37 days. If this issue is still valid, please ping a maintainer and ask them to label it as pinned, good first issue or help wanted. Thank you for your contributions.

github-actions[bot] avatar Sep 10 '22 03:09 github-actions[bot]

@kevinten10 Hi, I submitted a PR #786 to implement this proposal. Please help me review it :)

seeflood avatar Sep 15 '22 07:09 seeflood

Azure Service Bus also has this feature: https://learn.microsoft.com/en-us/azure/service-bus-messaging/message-sequencing#scheduled-messages

I updated your proposal.

seeflood avatar Sep 27 '22 09:09 seeflood