Connection management issues
Hi - Thank you for a great library!
We have been experiencing some connection management issues (including with the latest version, 5.2.0).
We use fred as a Redis client in a Kubernetes cluster (Redis is self-hosted in k8s). If the Redis instance is moved to another node, we end up with connection management in a corrupted state.
This can be reproduced simply by running kubectl scale deploy/redis --replicas 0.
We have tested with and without pipelining, and with and without a connection pool. The config uses an exponential reconnect policy plus a command timeout. Below is a trace of the events.
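For reference, here is a minimal sketch of roughly how we set up the client (fred 5.x style). It is simplified: the option values are illustrative, and the command timeout is shown as an explicit tokio timeout rather than the config option we use in production.

```rust
use fred::prelude::*;
use std::time::Duration;

#[tokio::main]
async fn main() -> Result<(), RedisError> {
  // Centralized (non-cluster) config pointing at the k8s service name.
  let config = RedisConfig {
    server: ServerConfig::new_centralized("redis", 6379),
    ..Default::default()
  };
  // Exponential reconnect policy: retry forever, 100 ms initial delay,
  // 30 s max delay, multiplier of 2. Values here are illustrative.
  let policy = ReconnectPolicy::new_exponential(0, 100, 30_000, 2);

  let client = RedisClient::new(config);
  let _ = client.connect(Some(policy));
  client.wait_for_connect().await?;

  // Command timeout shown as an explicit tokio timeout around the call.
  let resp = tokio::time::timeout(
    Duration::from_secs(5),
    client.set::<(), _, _>("foo", "bar", None, None, false),
  )
  .await;
  println!("resp {:?}", resp);

  let _ = client.quit().await;
  Ok(())
}
```

The pooled variant is the same idea, using fred's connection pool type instead of a single client.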
Initially we get a connection - everything is fine:
DEBUG fred::multiplexer::commands > fred-JUKIBXKdhV: Initializing connections...
TRACE fred::protocol::types > fred-JUKIBXKdhV: Using 10.0.78.183 among 1 possible socket addresses for redis:6379
TRACE fred::multiplexer::utils > fred-JUKIBXKdhV: Connecting to 10.0.78.183:6379
TRACE mio::poll > registering event source with poller: token=Token(0), interests=READABLE | WRITABLE
DEBUG fred::protocol::connection > fred-JUKIBXKdhV: Skip setting client name.
TRACE redis_protocol::utils > allocating more, len: 0, amt: 24
TRACE fred::protocol::codec > fred-JUKIBXKdhV: Encoded 24 bytes to 10.0.78.183:6379. Buffer len: 24 (RESP2)
TRACE tokio_util::codec::framed_impl > flushing framed transport
TRACE tokio_util::codec::framed_impl > writing; remaining=24
TRACE tokio_util::codec::framed_impl > framed transport flushed
TRACE tokio_util::codec::framed_impl > attempting to decode a frame
TRACE fred::protocol::codec > fred-JUKIBXKdhV: Recv 4 bytes from 10.0.78.183:6379 (RESP2).
TRACE fred::protocol::codec > fred-JUKIBXKdhV: Parsed 4 bytes from 10.0.78.183:6379
TRACE tokio_util::codec::framed_impl > frame decoded from buffer
DEBUG fred::protocol::connection > fred-JUKIBXKdhV: Read client ID: Resp2(Integer(7))
DEBUG fred::multiplexer::utils > fred-JUKIBXKdhV: Creating new close tx sender.
DEBUG fred::multiplexer::utils > fred-JUKIBXKdhV: Set centralized connection closed sender.
DEBUG fred::multiplexer::utils > fred-JUKIBXKdhV: Emitting connect message.
Now we scale down the deployment:
kubectl scale deploy/redis --replicas 0
Fred realizes that the connection is lost:
DEBUG fred::multiplexer::commands > fred-JUKIBXKdhV: Starting command stream...
DEBUG fred::multiplexer::commands > fred-JUKIBXKdhV: Set inner connection closed sender.
TRACE tokio_util::codec::framed_impl > attempting to decode a frame
TRACE fred::protocol::codec > fred-JUKIBXKdhV: Recv 0 bytes from 10.0.78.183:6379 (RESP2).
TRACE fred::protocol::codec > fred-JUKIBXKdhV: Recv 0 bytes from 10.0.78.183:6379 (RESP2).
DEBUG fred::multiplexer::utils > fred-JUKIBXKdhV: Redis frame stream closed with error Redis Error - kind: Canceled, details: Canceled.
DEBUG fred::multiplexer::utils > fred-JUKIBXKdhV: Emit connection closed from error: Redis Error - kind: Canceled, details: Canceled.
TRACE fred::multiplexer::utils > fred-JUKIBXKdhV: Emitting connection closed with 0 messages
DEBUG fred::multiplexer::commands > fred-JUKIBXKdhV: Recv reconnect message with 0 commands. State: Disconnected
INFO fred::multiplexer::commands > fred-JUKIBXKdhV: Sleeping for 132 ms before reconnecting
TRACE fred::protocol::types > fred-JUKIBXKdhV: Using 10.0.78.183 among 1 possible socket addresses for redis:6379
TRACE fred::multiplexer::utils > fred-JUKIBXKdhV: Connecting to 10.0.78.183:6379
TRACE mio::poll > registering event source with poller: token=Token(1), interests=READABLE | WRITABLE
Notice that at this point there are no pods running Redis, so the reconnect will just hang. Now we attempt to execute a command against Redis, using a client from the connection pool:
TRACE fred::multiplexer::commands > fred-JUKIBXKdhV: Recv command on multiplexer SET. Buffer len: 0
DEBUG fred::multiplexer::commands > fred-JUKIBXKdhV: Will block multiplexer loop waiting on SET to finish.
TRACE fred::multiplexer > fred-JUKIBXKdhV: Skip waiting on cluster sync.
DEBUG fred::multiplexer::utils > fred-JUKIBXKdhV: Writing command SET to redis:6379
TRACE fred::protocol::connection > fred-JUKIBXKdhV: Sending command and flushing the sink.
TRACE redis_protocol::utils > allocating more, len: 0, amt: 40
TRACE fred::protocol::codec > fred-JUKIBXKdhV: Encoded 40 bytes to 10.0.78.183:6379. Buffer len: 40 (RESP2)
TRACE tokio_util::codec::framed_impl > flushing framed transport
TRACE tokio_util::codec::framed_impl > writing; remaining=40
WARN fred::multiplexer::commands > fred-JUKIBXKdhV: Error writing command None: Redis Error - kind: IO, details: Os { code: 32, kind: BrokenPipe, message: "Broken pipe" }
DEBUG fred::multiplexer::commands > fred-JUKIBXKdhV: Reconnecting or stopping due to error: Redis Error - kind: IO, details: Os { code: 32, kind: BrokenPipe, message: "Broken pipe" }
DEBUG fred::multiplexer::utils > fred-JUKIBXKdhV: Emitting close all sockets message: Redis Error - kind: IO, details: Os { code: 32, kind: BrokenPipe, message: "Broken pipe" }
WARN fred::multiplexer::utils > fred-JUKIBXKdhV: Error sending close message to socket streams: SendError(Redis Error - kind: IO, details: Os { code: 32, kind: BrokenPipe, message: "Broken pipe" })
DEBUG fred::multiplexer::commands > fred-JUKIBXKdhV: Waiting for client to reconnect...
resp Err(Redis Error - kind: Timeout, details: Request timed out.)
Next we scale up redis:
kubectl scale deploy/redis --replicas 1
After a while the connection pool will reconnect:
DEBUG fred::protocol::connection > fred-JUKIBXKdhV: Skip setting client name.
TRACE redis_protocol::utils > allocating more, len: 0, amt: 24
TRACE fred::protocol::codec > fred-JUKIBXKdhV: Encoded 24 bytes to 10.0.78.183:6379. Buffer len: 24 (RESP2)
TRACE tokio_util::codec::framed_impl > flushing framed transport
TRACE tokio_util::codec::framed_impl > writing; remaining=24
TRACE tokio_util::codec::framed_impl > framed transport flushed
TRACE tokio_util::codec::framed_impl > attempting to decode a frame
TRACE fred::protocol::codec > fred-JUKIBXKdhV: Recv 4 bytes from 10.0.78.183:6379 (RESP2).
TRACE fred::protocol::codec > fred-JUKIBXKdhV: Parsed 4 bytes from 10.0.78.183:6379
TRACE tokio_util::codec::framed_impl > frame decoded from buffer
DEBUG fred::protocol::connection > fred-JUKIBXKdhV: Read client ID: Resp2(Integer(3))
DEBUG fred::multiplexer::utils > fred-JUKIBXKdhV: Set centralized connection closed sender.
TRACE mio::poll > deregistering event source from poller
DEBUG fred::multiplexer::commands > fred-JUKIBXKdhV: Reconnect task finished reconnecting or syncing with: Ok(())
DEBUG fred::multiplexer::commands > fred-JUKIBXKdhV: Sending 0 commands after reconnecting.
DEBUG fred::multiplexer::utils > fred-JUKIBXKdhV: Emitting connect message.
TRACE tokio_util::codec::framed_impl > attempting to decode a frame
TRACE fred::protocol::codec > fred-JUKIBXKdhV: Recv 0 bytes from 10.0.78.183:6379 (RESP2).
And now we have the problem - if we try to execute a command using the client from the connection pool:
TRACE fred::multiplexer::commands > fred-JUKIBXKdhV: Recv command on multiplexer SET. Buffer len: 0
DEBUG fred::multiplexer::commands > fred-JUKIBXKdhV: Will block multiplexer loop waiting on SET to finish.
TRACE fred::multiplexer > fred-JUKIBXKdhV: Skip waiting on cluster sync.
DEBUG fred::multiplexer::utils > fred-JUKIBXKdhV: Writing command SET to redis:6379
TRACE fred::protocol::connection > fred-JUKIBXKdhV: Sending command and flushing the sink.
TRACE redis_protocol::utils > allocating more, len: 0, amt: 40
TRACE fred::protocol::codec > fred-JUKIBXKdhV: Encoded 40 bytes to 10.0.78.183:6379. Buffer len: 40 (RESP2)
TRACE tokio_util::codec::framed_impl > flushing framed transport
TRACE tokio_util::codec::framed_impl > writing; remaining=40
TRACE tokio_util::codec::framed_impl > framed transport flushed
DEBUG fred::multiplexer::commands > fred-JUKIBXKdhV: Waiting on last request to finish without pipelining.
TRACE tokio_util::codec::framed_impl > attempting to decode a frame
TRACE fred::protocol::codec > fred-JUKIBXKdhV: Recv 5 bytes from 10.0.78.183:6379 (RESP2).
TRACE fred::protocol::codec > fred-JUKIBXKdhV: Parsed 5 bytes from 10.0.78.183:6379
TRACE tokio_util::codec::framed_impl > frame decoded from buffer
TRACE fred::multiplexer::responses > fred-JUKIBXKdhV: Processing response from redis:6379 to SET with frame kind SimpleString
TRACE fred::multiplexer::responses > fred-JUKIBXKdhV: Writing to multiplexer sender to unblock command loop.
WARN fred::multiplexer::responses > fred-JUKIBXKdhV: Error sending cmd loop response: ()
TRACE fred::multiplexer::responses > fred-JUKIBXKdhV: Responding to caller for SET
WARN fred::multiplexer::responses > fred-JUKIBXKdhV: Failed to respond to caller.
TRACE tokio_util::codec::framed_impl > attempting to decode a frame
TRACE fred::protocol::codec > fred-JUKIBXKdhV: Recv 0 bytes from 10.0.78.183:6379 (RESP2).
resp Err(Redis Error - kind: Timeout, details: Request timed out.)
The command is actually executed (we can verify this using redis-cli and checking that the key has been SET in Redis). All subsequent requests using this client now result in timeouts, even though the commands are in fact executed.
For reference, we use tokio v1.20.1.
If we use a headless service in k8s, we do not experience this problem, since DNS resolves to zero IP addresses while the deployment is scaled to 0 replicas, and resolves to the new IP address once the pod has been moved to the other node.
Hi @birkmose, thanks for the comprehensive bug report.
This seems related to https://github.com/aembke/fred.rs/issues/59, at least in terms of the underlying issue with the implementation. Unfortunately the fix to either is highly invasive, but that's what I've been working on for the last several weeks. Version 6 will fix both and I'm hoping to have that out in the next week or two.
Just as a heads up - I may reach out in the next week or two for some feedback on a CI repro for this. So far I've tried to avoid adding minikube to the mix in CI, but that seems unavoidable at this point.
> I may reach out in the next week or two for some feedback on a CI repro for this
Happy to help out!
We are also using this library against a Redis cluster running on k8s and ran into the same issue.
Hi @aembke, is there any news/progress regarding the issue?
Hey!
@aembke Thanks for the update. I've tried 6.0.0-beta.1 but it looks like the issue still exists 🤔
I receive Timeout Error: Request timed out. (for example if a node was restarted, or right after the cluster comes back up), and for some reason I see Skip reconnecting to x.x.x.x:port in the logs:
TRACE fred::multiplexer::commands: fred-87YuK2J84E: Recv command: MultiplexerCommand { kind: "Reconnect", server: Some(Server { host: "x.x.x.x", port: 6379, tls_server_name: None }), force: false }
DEBUG fred::multiplexer::commands: fred-87YuK2J84E: Reconnecting to Some(Server { host: "x.x.x.x", port: 6379, tls_server_name: None }) (force: false)
DEBUG fred::multiplexer::commands: fred-87YuK2J84E: Skip reconnecting to x.x.x.xx:6379
Btw, one more issue (I can create a new one on GH to keep the scopes separate, but this topic seems general enough):
I also found an issue related to the hash-slot cache, specifically its initial population. It's possible to start the application during a failover, for example, when one of the masters isn't assigned yet; in that case the client won't connect to that node until the cluster is re-synced as a result of some other error. I think it makes sense to re-check the cluster slots/nodes when the cache doesn't contain the slot, instead of immediately returning an error (Possible cluster misconfiguration. Missing hash slot owner for {}).
I had to work around this with my own wrapper at the application level, but that feels like the wrong place for it.
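For illustration, here's a simplified sketch of the kind of wrapper I mean. The helper and the error-text match are my own workaround, not fred's API:

```rust
use fred::prelude::*;
use std::{future::Future, time::Duration};

// Application-level workaround (not fred's API): retry a command a few times
// when the hash-slot cache looks stale, instead of failing on the first
// "Missing hash slot owner" error. Matching on the error text is an
// assumption; a dedicated error kind would be cleaner.
async fn with_slot_retry<T, F, Fut>(mut op: F) -> Result<T, RedisError>
where
  F: FnMut() -> Fut,
  Fut: Future<Output = Result<T, RedisError>>,
{
  let mut attempts = 0;
  loop {
    match op().await {
      Ok(value) => return Ok(value),
      Err(e) if attempts < 3 && e.to_string().contains("Missing hash slot owner") => {
        attempts += 1;
        tokio::time::sleep(Duration::from_millis(250)).await;
      },
      Err(e) => return Err(e),
    }
  }
}
```

A call site just wraps the fred command in a closure, so the retry/backoff logic stays in one place.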
Hey @DDtKey, thanks for checking on that so quickly. I'm about a day away from finishing my work on the cleanup branch and then was planning on diving into this one.
One quick thing - the changes you recommended around missing hash slots sound good and I'll get that in this version before I push it to crates.io.
Quick question - is the hostname you're using there a domain name or an IP address? Also, is your repro the same as the original repro @birkmose used? Or are you doing anything differently to repro this?
I'll get back to you later today or tomorrow with more in-depth info though. I've been meaning to add minikube to CI here to get a regression test for @birkmose's original issue, but that might take a bit.
> Quick question - is the hostname you're using there a domain name or an IP address? Also, is your repro the same as the original repro @birkmose used? Or are you doing anything differently to repro this?
Well, I reproduced it mostly with docker-compose, but with similar behaviour regarding downscaling and scaling back up.
It was a domain name (with the application inside the same network). But it was also reproducible with an IP when I ran it through a port mapping to localhost (127.0.0.1) with the application running locally.
So this was a fun one. Just for kicks in case anybody was wondering - newer tokio changed how flush works on a split socket so that it does nothing if there are no bytes sitting in the underlying buffer. This makes sense as an optimization but also means that it can't be used to test connectivity anymore. I could be wrong on when or where that change occurred with tokio, but the TLDR is that we can't use flush for testing connectivity anymore.
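To illustrate the point with a standalone sketch (not fred internals):

```rust
use tokio::io::AsyncWriteExt;
use tokio::net::TcpStream;

// With recent tokio versions, flushing the write half of a split TcpStream is
// effectively a no-op when nothing is buffered, so it returns Ok(()) even if
// the peer has already gone away. Only a real write (possibly followed by a
// second write once the peer's RST arrives) surfaces the broken connection.
#[tokio::main]
async fn main() -> std::io::Result<()> {
  let stream = TcpStream::connect("127.0.0.1:6379").await?;
  let (_read_half, mut write_half) = stream.into_split();

  // Kill the server at this point, then:
  write_half.flush().await?; // still Ok(()) - nothing buffered, nothing tested
  write_half.write_all(b"PING\r\n").await?; // this (or the next write) errors
  Ok(())
}
```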
I'll have a patch up in 6.0.0-beta.1 sometime tonight or tomorrow if you want to test it again, but I was able to repro your issue and it seems to be addressed after this change.
Hey @aembke, thanks for your efforts!
However, I'm still getting timeouts with the same use case. It's important to note that the new logs show the connection as working and reconnection being skipped:
DEBUG fred::multiplexer::commands: fred-XxnvWboQLM: Has working connection: true
DEBUG fred::multiplexer::commands: fred-XxnvWboQLM: Skip reconnecting to 192.168.208.3:6379
But the most interesting thing is that it happens even after restarting an unrelated cluster node (one that owns another slot range). I mean, a command is supposed to be directed to (for example) node-1 (due to the hash distribution), but I downscale/scale node-2 and then it hangs for all subsequent requests until the command timeout is exceeded (MONITOR also stops showing the commands, so it looks like the server doesn't receive them at all) 🤔
I also wonder if it would be possible to allow customizing the DNS resolver? Perhaps even with an embedded trust-dns-resolver behind a feature flag. That would also be really nice!
Huh, that's interesting. There must be something different with our repros. Can you include more of the trace logs from your repro? I suspect that the cluster may be promoting one of the other nodes to a primary while the first one is down. This can make it a bit tougher to follow what's going on, but the trace logs show the CLUSTER SLOTS response on each reconnection attempt which should help to debug this.
And yeah overriding DNS logic sounds like a good idea, I'll add that too. I have a trait internally that I use for that, so it likely won't take any dependency on trust-dns-resolver directly, but you could certainly set it up to use that instead.
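Roughly the shape I'm thinking of, purely as a sketch - the final trait name, signature, and error type may differ:

```rust
use std::{future::Future, io, net::SocketAddr, pin::Pin};

// Sketch of a pluggable DNS interface; names and signatures here are
// illustrative, not fred's actual API.
pub trait Resolve: Send + Sync + 'static {
  fn resolve(
    &self,
    host: String,
    port: u16,
  ) -> Pin<Box<dyn Future<Output = io::Result<Vec<SocketAddr>>> + Send>>;
}

// Example implementation that uses the system resolver via tokio. A
// trust-dns-resolver version would slot in the same way.
pub struct SystemResolver;

impl Resolve for SystemResolver {
  fn resolve(
    &self,
    host: String,
    port: u16,
  ) -> Pin<Box<dyn Future<Output = io::Result<Vec<SocketAddr>>> + Send>> {
    Box::pin(async move {
      let addrs: Vec<SocketAddr> =
        tokio::net::lookup_host((host.as_str(), port)).await?.collect();
      Ok(addrs)
    })
  }
}
```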
Well, you can find part of the logs here: gist
Let me know if that's not enough, or if the tokio logs are too noisy. This one is a fairly quick re-scaling of a node (it corresponds to 172.30.0.4 in the logs).
This part repeats after the first attempt to execute a command:
TRACE fred::multiplexer::commands: fred-97WKglxhGm: Recv command: MultiplexerCommand { kind: "Command", command: "ZCOUNT" }
at /usr/local/cargo/git/checkouts/fred.rs-c7346550584dadf6/0ec0212/src/multiplexer/commands.rs:501 on reactor
TRACE fred::protocol::command: fred-97WKglxhGm: Pipeline check ZCOUNT: true
at /usr/local/cargo/git/checkouts/fred.rs-c7346550584dadf6/0ec0212/src/protocol/command.rs:1484 on reactor
DEBUG fred::multiplexer::clustered: fred-97WKglxhGm: Writing command `ZCOUNT` to 172.30.0.7:6379
at /usr/local/cargo/git/checkouts/fred.rs-c7346550584dadf6/0ec0212/src/multiplexer/clustered.rs:66 on reactor
TRACE fred::multiplexer::utils: fred-97WKglxhGm: Sending command ZCOUNT to 172.30.0.7:6379
at /usr/local/cargo/git/checkouts/fred.rs-c7346550584dadf6/0ec0212/src/multiplexer/utils.rs:129 on reactor
TRACE fred::multiplexer::commands: fred-97WKglxhGm: Sent command to 172.30.0.7:6379. Flushed: false
at /usr/local/cargo/git/checkouts/fred.rs-c7346550584dadf6/0ec0212/src/multiplexer/commands.rs:155 on reactor
WARN Timeout Error: Request timed out
The cluster is configured with 6 nodes (3 masters / 3 replicas), but it's reproducible without replicas as well.
At the end you can see the Timeout Error - it hangs forever and for any slot; it just starts returning timeouts for everything.
But I can easily execute commands via redis-cli against the cluster and the restored node.
In case it's helpful, here is the simplified `docker-compose` Redis config (the service runs inside the same network):
redis-node-0:
  image: docker.io/bitnami/redis-cluster:6.2
  environment:
    - 'ALLOW_EMPTY_PASSWORD=yes'
    - 'REDIS_CLUSTER_REPLICAS=1'
    - 'REDIS_NODES=redis-node-0 redis-node-1 redis-node-2 redis-node-3 redis-node-4 redis-node-5'
    - 'REDIS_CLUSTER_CREATOR=yes'
  depends_on:
    - redis-node-1
    - redis-node-2
    - redis-node-3
    - redis-node-4
    - redis-node-5
  networks:
    - my-net

redis-node-1:
  image: docker.io/bitnami/redis-cluster:6.2
  environment:
    - 'ALLOW_EMPTY_PASSWORD=yes'
    - 'REDIS_NODES=redis-node-0 redis-node-1 redis-node-2 redis-node-3 redis-node-4 redis-node-5'
  networks:
    - my-net

redis-node-2:
  image: docker.io/bitnami/redis-cluster:6.2
  environment:
    - 'ALLOW_EMPTY_PASSWORD=yes'
    - 'REDIS_NODES=redis-node-0 redis-node-1 redis-node-2 redis-node-3 redis-node-4 redis-node-5'
  networks:
    - my-net

redis-node-3:
  image: docker.io/bitnami/redis-cluster:6.2
  environment:
    - 'ALLOW_EMPTY_PASSWORD=yes'
    - 'REDIS_NODES=redis-node-0 redis-node-1 redis-node-2 redis-node-3 redis-node-4 redis-node-5'
  networks:
    - my-net

redis-node-4:
  image: docker.io/bitnami/redis-cluster:6.2
  environment:
    - 'ALLOW_EMPTY_PASSWORD=yes'
    - 'REDIS_NODES=redis-node-0 redis-node-1 redis-node-2 redis-node-3 redis-node-4 redis-node-5'
  networks:
    - my-net

redis-node-5:
  image: docker.io/bitnami/redis-cluster:6.2
  environment:
    - 'ALLOW_EMPTY_PASSWORD=yes'
    - 'REDIS_NODES=redis-node-0 redis-node-1 redis-node-2 redis-node-3 redis-node-4 redis-node-5'
  networks:
    - my-net
And I have some other configs with the same behavior (without docker-compose).
Regarding DNS - that would be great! Thanks! 🚀
Quick update for the folks following this issue - I just published 6.0.0-beta.1 to crates.io. It has several fixes for this issue and the other connection management issues. If you have some time over the next week or so I'd appreciate any feedback on whether it addresses your use case or not. My goal is to publish 6.0.0 by the 20th.
Thanks to @aembke and all the contributors for all the effort. Many cases were covered in this release and it now looks much more stable (at least it works well for both self-hosted and AWS instances in my tests) 👍 I really appreciate that you responsively helped with these important issues 🙏
Appreciate the kind words @DDtKey.
To be fair to the others in this thread though, and being totally frank here... "responsive" is maybe a bit of a stretch. The majority of these issues popped up for folks right as I stepped away from the repo to travel, etc. The timing was unfortunate, and for what it's worth it's not my intention to let things like this linger for 2+ months again. I'll likely put out a call for more contributors soon to address that, but I want to focus on getting a more reliable and sustainable build out first.
(Figured I'd say what I'm guessing several folks in this thread are thinking, but are probably too nice to write here.)
Thank you for the great work @aembke! I just tested that 6.0.0-beta.2 fixes the problem I was observing, and I can no longer reproduce the error (I can consistently reproduce it with 5.2.0). We will probably roll this out in production after New Year, once the stable version 6.0.0 hits crates.io. Again - thank you for the hard work! 🎅 came early this year!
Thanks for checking on that @birkmose. I'm going to close this for now then, and I'll post an update once 6.0.0 is on crates.io.