PubSub in Redis - any plans?
Hi,
do you plan on providing a pubsub feature for the redis backend? I would like to use twemproxy, but this feature is crucial for me.
BTW, thanks for a great project.
@mthenw could you describe in detail the scenario you are attempting to solve with pubsub support in twemproxy?
I'm using redis as a queue with persistence, and I would like to scale (shard) it easily without using language-dependent client libraries. On one side of the queue I produce messages (from multiple instances) by publishing on a channel and pushing to a list (the history). On the other side I have consumers that process some of them and, from time to time, read the history (from the list).
Just to be clear, do I understand correctly that you are publishing messages by announcing them in a pub channel, and then adding them to a single redis list?
If so, twemproxy will not scale that for you with or without pub sub. All your items in a single key will still reside on a single shard. Twemproxy partitions keys based on their name, not their contents.
You could possibly change to storing a key/value pair for each item, then "announce" it by adding its key to the left side of a list, while your consumers pop from the right side. The single list would of course reside in a single key, but the data itself could be partitioned, splitting it among the N nodes in the cluster.
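For illustration, a minimal redis-py sketch of that layout (the twemproxy host/port and key names here are made up):

```python
import json
import uuid

import redis

# Both sides talk to twemproxy, which shards the per-item keys.
# Host/port are assumptions; 22121 is a common twemproxy port.
r = redis.Redis(host="127.0.0.1", port=22121)

def produce(payload):
    # Store each item under its own key so it can be partitioned...
    item_key = "item:%s" % uuid.uuid4()
    r.set(item_key, json.dumps(payload))
    # ...and "announce" it by pushing only the key name onto a single
    # list. The list lives on one shard, but holds key names, not data.
    r.lpush("announce", item_key)

def consume():
    # Pop the oldest announcement and fetch the item it points at.
    item_key = r.rpop("announce")
    if item_key is None:
        return None
    data = r.get(item_key)
    return json.loads(data) if data is not None else None
```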
Ultimately, the question to be asked is how much data you are trying to solve for. Twemproxy or even Redis may not be the right tool for the job.
I wasn't clear enough :) I'm completely aware that I need to use multiple lists and multiple channels if I want to scale across multiple servers. I also know that I could use other tools (RabbitMQ or Apache Kafka, probably), but I would like to do this with Redis.
I've thought about this a little bit, and here's more or less the reasoning/conclusions I came to.
SUBSCRIBE requires an open active connection to redis to get channel messages. This sort of breaks twemproxy's "one connection per redis instance" groove.
You can get around this (sort of) by having twemproxy PSUBSCRIBE * to each of its connected instances and manage subscriptions itself locally, based on parsing incoming messages.
From here it's pretty straightforward: you can route PUBLISH messages to their respective nodes using the usual hashing method, and twemproxy could look at received messages and route them to the correct subscribers.
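As a toy sketch of that relay idea (plain redis-py and Python threads standing in for twemproxy internals; the backend list and the hash are made up, not twemproxy's actual ones):

```python
import threading
from collections import defaultdict

import redis

# Hypothetical backend shards the proxy is connected to.
BACKENDS = [("10.0.0.1", 6379), ("10.0.0.2", 6379)]

# channel name -> set of local subscriber callbacks
subscribers = defaultdict(set)

def route_publish(channel, message):
    # PUBLISH routes like any other command: hash the channel name
    # to pick a backend (Python's hash() stands in for the real one).
    host, port = BACKENDS[hash(channel) % len(BACKENDS)]
    redis.Redis(host=host, port=port).publish(channel, message)

def relay(host, port):
    # PSUBSCRIBE * pins one connection per backend open forever;
    # every message published anywhere flows back through here.
    pubsub = redis.Redis(host=host, port=port).pubsub()
    pubsub.psubscribe("*")
    for msg in pubsub.listen():
        if msg["type"] == "pmessage":
            for callback in subscribers[msg["channel"].decode()]:
                callback(msg["data"])

for host, port in BACKENDS:
    threading.Thread(target=relay, args=(host, port), daemon=True).start()
```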
However, if you step back and think about it a bit, this isn't really saving you anything. You're introducing a lot of overhead pretty much just to spread out which redis instances are dealing with the underlying channels, but those channels all just point back at twemproxy, which would then handle directing messages to the right subscribers. It's all just overhead, at least with my naive implementation. You'd be better off just spinning up another redis instance whose sole job is to deal with pub/sub.
It's a little less elegant, but it's easy.
Well, we have a setup where each application server (meaning a server that runs many instances of our application) has its own local twemproxy instance that proxies to a shared pool of Redis servers. With pubsub support in twemproxy as you described, we could easily leverage Redis' pubsub functionality for some lightweight message passing in a highly available manner. Setting up a separate Redis instance would introduce new availability problems that twemproxy currently solves for us (unless we set up a Redis cluster, of course).
If it doesn't impact twemproxy performance for regular key/value traffic, it would be a nice-to-have feature.
Variables
S = number of subscribing connections
P = number of published messages
N = number of twemproxy/Nutcracker instances
R = number of redis instances
A = number of application servers
F = number of redis instance up/down notifications
Single Redis Instance Overhead
If you use a single redis instance for all of your pub/sub notifications, you could think of the performance overhead as O(S*P). That obviously isn't great, but it's honestly not much different from the naive pub/sub on twemproxy I described.
Naive Pub/Sub Overhead
The naive implementation I proposed doesn't scale well either. The overhead on twemproxy is O(R + S/N * P), which isn't great, because you still have to handle every published message. The overhead on each redis instance is great, O(S/R * P/R), but adding a new redis instance gets you nothing here for scalability, because twemproxy will be the clear bottleneck (in fact, it adds more load to each twemproxy instance).
Depending on your pub/sub load, this could have a significant impact on twemproxy's performance. Hashing and handling requests already puts twemproxy at a higher CPU load than redis instances even with a single instance connected. My best guess is that the load from handling a published message per subscriber will be about the same as handling a key hash.
Due to these factors, I think it's unlikely this naive implementation of Pub/Sub would make it upstream. I'd still argue that this is worse than the single redis instance approach because of the rather excessive overhead I believe will be added to each twemproxy instance. You do get high availability, but it's at a great cost.
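To make the comparison concrete, here are those formulas with some made-up numbers plugged in:

```python
# Illustrative numbers, all assumptions: 10k subscribers, 50k published
# messages, 20 twemproxy instances, 8 redis instances.
S, P, N, R = 10_000, 50_000, 20, 8

single_redis = S * P              # O(S*P), everything on one instance
naive_proxy = R + (S / N) * P     # O(R + S/N * P), per twemproxy
naive_redis = (S / R) * (P / R)   # O(S/R * P/R), per redis instance

# Each twemproxy still sees ~1/N of the total delivery work, so adding
# redis instances doesn't help: the proxies stay the bottleneck.
print(f"single redis instance: {single_redis:,.0f} deliveries")
print(f"naive, per twemproxy:  {naive_proxy:,.0f}")
print(f"naive, per redis:      {naive_redis:,.0f}")
```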
A More Scalable Approach
A more scalable solution would be to add subscribe functionality to twemproxy for redis node up/down notifications, along with the ability to ask twemproxy which redis instance to connect to for a given key. Then, if/when that connection dies or comes back to life, you would get a notification and could invalidate it and re-ask Nutcracker for a new one.
That way twemproxy has one subscriber per application server connected to it (in your case 1), giving it O(A/N * F) overhead (essentially none, because F will be very small), and each redis instance would again have O(S/R * P/R) overhead, which scales perfectly.
The downside here is that it necessitates a client-side implementation to handle reconnecting to up/downed nodes, and it makes twemproxy's drop-in replacement capability less clear. I don't think this implementation would ever make it upstream into twemproxy because of that.
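For what it's worth, the client-side half could look something like the sketch below. The shard list and the up/down notification hook are hypothetical (neither API exists), and a real client would have to match the hash and distribution configured in twemproxy exactly; FNV-1a is just one of the hashes twemproxy supports.

```python
import redis

# Hypothetical: in the proposed design the client would ask twemproxy
# for this mapping instead of hard-coding the shard list.
SHARDS = [("10.0.0.1", 6379), ("10.0.0.2", 6379), ("10.0.0.3", 6379)]

def fnv1a_64(key):
    # Standard 64-bit FNV-1a; must match whatever hash twemproxy is
    # actually configured with for the mapping to agree.
    h = 0xcbf29ce484222325
    for byte in key.encode():
        h = ((h ^ byte) * 0x100000001b3) & 0xFFFFFFFFFFFFFFFF
    return h

def subscribe(channel, callback):
    # Bypass the proxy: connect straight to the shard that owns this
    # channel and hold the subscription open there.
    host, port = SHARDS[fnv1a_64(channel) % len(SHARDS)]
    pubsub = redis.Redis(host=host, port=port).pubsub()
    pubsub.subscribe(channel)
    for msg in pubsub.listen():
        if msg["type"] == "message":
            callback(msg["data"])
    # On a node up/down notification from twemproxy (not shown, since
    # that API doesn't exist), drop the connection, re-ask for the
    # owning shard, and resubscribe.
```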
The last (scalable) approach proposed by @tejacques makes sense and might become a viable solution given the constraints of pub-sub. Anyone want to take a stab at solving pub-sub in twemproxy as a fun experimental project? :) I can create a twemproxy_pubsub branch for this experimental feature.
Any update on this front?
Well, like @mthenw, I also need this, because of Spring Session Redis. The Session Event feature requires PUB/SUB functionality from redis.
I believe someone resurrected this thread then deleted their comment.
For the benefit of anyone running into this in the future: Redis cluster with the stream data type works the way I described earlier in the thread. See the following for additional information:
https://github.com/antirez/redis/issues/2672
https://redis.io/topics/streams-intro
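For reference, the original queue-with-history pattern looks roughly like this with streams and redis-py (key and field names are illustrative):

```python
import redis

r = redis.Redis()  # or a cluster client; stream keys shard by name

# Producer: XADD both announces the message and persists it, so the
# PUBLISH + LPUSH pair from the original question collapses into one
# command.
r.xadd("events", {"payload": "hello"})

# Consumer: block waiting for new entries, like a subscription...
for stream, entries in r.xread({"events": "$"}, block=5000):
    for entry_id, fields in entries:
        print(entry_id, fields)

# ...and read back the full history at any time, like the list did.
history = r.xrange("events", "-", "+")
```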