
Add durability using eventsourced

Open tlvenn opened this issue 11 years ago • 3 comments

Hi Alex,

I was wondering if EventSourced ( https://github.com/eligosource/eventsourced ) could be used to further enhance DubSub, most notably to provide durability when needed. Among other things, it would allow DubSub to support topics with guaranteed delivery.

tlvenn avatar Apr 25 '13 03:04 tlvenn

Hello,

Yes, I think that EventSourced could be used to enhance DubSub, but you would have to be careful about where you do so (so as not to replay state that the cluster would gather anyway).

I think that one needs to consider two separate parts for durability of DubSub:

  1. Subscription information
  2. Publish messages

Subscription information is currently distributed around the cluster without any durability, but I am considering migrating this to a gossip-protocol style of dissemination that would actually eliminate any need to use EventSourced to ensure delivery. This method might incur a slight delay (< 1 sec) but would help with scaling out to hundreds of nodes.

Unlike subscriptions, Publish messages could be enhanced with EventSourced to guarantee at-least-once delivery. This will of course incur some performance overhead, but this reliability may be a requirement, and it would be extremely useful if you could turn it on or off in different cases.
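
To make the idea concrete, here is a rough sketch in Python. It is not DubSub or EventSourced code (both are Scala/Akka), and every name in it (`Journal`, `Publisher`, the `durable` flag) is hypothetical. It assumes a publish is written to an append-only log before it is sent, marked as confirmed once delivered, and that anything still unconfirmed is replayed after a restart:

```python
import json
import os
import tempfile
import uuid


class Journal:
    """Append-only file log (hypothetical stand-in for an event-sourcing journal)."""

    def __init__(self, path):
        self.path = path

    def append(self, record):
        with open(self.path, "a", encoding="utf-8") as f:
            f.write(json.dumps(record) + "\n")
            f.flush()
            os.fsync(f.fileno())  # force the record to disk before sending

    def unconfirmed(self):
        """Return journaled publishes that have no matching confirmation."""
        pending = {}
        if not os.path.exists(self.path):
            return []
        with open(self.path, encoding="utf-8") as f:
            for line in f:
                rec = json.loads(line)
                if rec["type"] == "published":
                    pending[rec["id"]] = rec
                elif rec["type"] == "confirmed":
                    pending.pop(rec["id"], None)
        return list(pending.values())


class Publisher:
    def __init__(self, journal, send):
        self.journal = journal
        self.send = send  # callable(topic, payload) -> True if delivered

    def publish(self, topic, payload, durable=False):
        if not durable:
            self.send(topic, payload)  # fire-and-forget, no journal overhead
            return
        msg_id = uuid.uuid4().hex
        self.journal.append(
            {"type": "published", "id": msg_id, "topic": topic, "payload": payload}
        )
        self._deliver(msg_id, topic, payload)

    def _deliver(self, msg_id, topic, payload):
        if self.send(topic, payload):
            self.journal.append({"type": "confirmed", "id": msg_id})

    def replay(self):
        """Re-send every durable publish that was never confirmed."""
        for rec in self.journal.unconfirmed():
            self._deliver(rec["id"], rec["topic"], rec["payload"])


def demo():
    path = os.path.join(tempfile.mkdtemp(), "journal.log")
    delivered = []
    network_up = [False]

    def send(topic, payload):
        if network_up[0]:
            delivered.append((topic, payload))
            return True
        return False

    publisher = Publisher(Journal(path), send)
    publisher.publish("news", "lost", durable=False)  # dropped while network is down
    publisher.publish("news", "kept", durable=True)   # journaled, not yet delivered
    network_up[0] = True
    Publisher(Journal(path), send).replay()           # simulate a restart
    return delivered
```

The `durable` flag is the on/off switch: non-durable publishes skip the journal (and the `fsync` cost) entirely.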

Another thing worth mentioning: restarting actors (or even whole systems) and replaying state with EventSourced so that they hold the same subscription information would probably be the wrong thing to do. The subscribers of that node would be aware of the disconnection and so should subscribe again (albeit losing any messages that were sent in the meantime). It really depends on what your system's architecture looks like, but for the typical case I think this is transient information. Likewise, if a system is restarted, should it try to send Publish messages that were not delivered? Depending on how long the system is down, this could cause more problems for a particular application if the information is stale.

All things considered, I think it would make a great addition and I will definitely consider adding it in the near future.

Let me know your thoughts.

Thanks, Alex

alexanderjarvis avatar May 03 '13 17:05 alexanderjarvis

DubSub has just been upgraded with some reliability improvements that reduce (to some degree) the need to introduce external persistence and EventSourced.

Changes in subscriptions are now disseminated around the cluster randomly, by default every 500ms, so even if the first attempt fails (because of a network failure, for example), the information will be updated on the next round. Previously, subscriptions were sent to all other nodes as soon as they were made on each node, which could have led to partial subscription information in the cluster.
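
As an illustration only, the round-based dissemination can be simulated in a few lines of Python. This is not DubSub's implementation; the `Node` class, the grow-only set of subscriptions, and the `drop_rate` parameter are all assumptions made for the sketch (real state would also need to handle unsubscribes). Each round stands in for one 500ms tick, in which every node pushes what it knows to one randomly chosen peer:

```python
import random


class Node:
    def __init__(self, name):
        self.name = name
        self.subscriptions = set()  # {(node_name, topic)}, grow-only in this sketch

    def subscribe(self, topic):
        self.subscriptions.add((self.name, topic))

    def gossip_to(self, peer, link_ok=True):
        """Push this node's view to a peer; a failed link loses the message."""
        if not link_ok:
            return False
        peer.subscriptions |= self.subscriptions
        return True


def converged(nodes):
    return all(n.subscriptions == nodes[0].subscriptions for n in nodes)


def rounds_until_converged(nodes, rng, drop_rate=0.0, max_rounds=1000):
    """Run gossip rounds until every node holds the same view.

    Returns the number of rounds used, or None if max_rounds was reached.
    """
    for round_number in range(1, max_rounds + 1):
        for node in nodes:
            peer = rng.choice([n for n in nodes if n is not node])
            node.gossip_to(peer, link_ok=rng.random() >= drop_rate)
        if converged(nodes):
            return round_number
    return None


def demo():
    rng = random.Random(42)
    nodes = [Node(name) for name in "abcde"]
    nodes[0].subscribe("news")
    nodes[3].subscribe("sport")
    # 30% of gossip messages are lost; later rounds repair the gaps.
    rounds = rounds_until_converged(nodes, rng, drop_rate=0.3)
    return rounds, nodes
```

A lost message only delays convergence, because the same information is offered again in a later round.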

Publishes are now buffered for 60 seconds, so that if a subscription arrives that was made on another node before the time of the publish, the publish is sent to the subscribing node. This is configurable, however, and can be turned off.
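
A minimal sketch of such a buffer, in Python rather than DubSub's own Scala, might look like the following. The class and method names are hypothetical, timestamps are passed in explicitly, and it assumes node clocks are close enough for "subscribed before the publish" to be compared directly:

```python
from collections import deque


class PublishBuffer:
    """Keeps recent publishes so late-arriving subscriptions can catch up."""

    def __init__(self, retention_seconds=60.0):
        # A retention of 0 disables buffering altogether.
        self.retention = retention_seconds
        self._buffer = deque()  # (published_at, topic, payload), oldest first

    def record(self, topic, payload, now):
        self._evict(now)
        if self.retention > 0:
            self._buffer.append((now, topic, payload))

    def missed_by(self, topic, subscribed_at, now):
        """Buffered publishes on `topic` made at or after `subscribed_at`."""
        self._evict(now)
        return [
            payload
            for published_at, buffered_topic, payload in self._buffer
            if buffered_topic == topic and published_at >= subscribed_at
        ]

    def _evict(self, now):
        # Drop entries older than the retention window.
        while self._buffer and now - self._buffer[0][0] > self.retention:
            self._buffer.popleft()
```

When subscription information reaches a node late, that node would call `missed_by` with the time the subscription was originally made and forward whatever comes back to the subscribing node.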

Publishes are currently not acknowledged or retried when they are sent between nodes, but this could be introduced to further increase reliability.
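
For what it is worth, acknowledge-and-retry could look roughly like this. The sketch is Python, not DubSub code, and `Receiver`, `LossyLink` and `send_with_retry` are made-up names. It also shows why the receiving side would want to de-duplicate: when an acknowledgement is lost, the sender retries a message that was in fact already delivered.

```python
class Receiver:
    """Receiving node that ignores message ids it has already seen."""

    def __init__(self):
        self.seen = set()
        self.delivered = []

    def receive(self, msg_id, payload):
        if msg_id not in self.seen:
            self.seen.add(msg_id)
            self.delivered.append(payload)
        return True  # acknowledge, including for duplicates


class LossyLink:
    """Test double: delivers every message but loses the first few acks."""

    def __init__(self, receiver, lose_acks):
        self.receiver = receiver
        self.lose_acks = lose_acks

    def send(self, msg_id, payload):
        acked = self.receiver.receive(msg_id, payload)
        if self.lose_acks > 0:
            self.lose_acks -= 1
            return False  # the ack never reaches the sender
        return acked


def send_with_retry(link, msg_id, payload, max_attempts=3):
    """Send until acknowledged; return the attempts used, or None on failure.

    A real implementation would wait for an ack timeout between attempts.
    """
    for attempt in range(1, max_attempts + 1):
        if link.send(msg_id, payload):
            return attempt
    return None
```

With two lost acks the sender needs three attempts, yet the subscriber still sees the payload only once.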

alexanderjarvis avatar Jun 02 '13 17:06 alexanderjarvis

Hi Alexander,

Sorry for the lack of feedback on your previous comment... Overall I totally agree with it, and I welcome the upgrade you just pushed. Nice work!

tlvenn avatar Jun 04 '13 03:06 tlvenn