fred.rs
[Bug] refreshing cluster slot owners after failed cluster is recovered
- Redis version - 6.2.7
- Platform - linux
- Using Docker and/or Kubernetes - yes
- Deployment type - cluster
Describe the bug
If the cluster is in a genuinely misconfigured / failed state (which I'm not sure how to reproduce, and would generally like to avoid altogether) but then recovers, the fred clients are (I presume) unable to refresh the cluster slot distribution once the cluster is healthy again (whether it recovered on its own or was completely restarted / replaced).
As we retry on failed connections, all I can see are errors like the one below.
Logs
"timestamp":"2023-12-26T09:27:36.487891Z","level":"WARN","fields":{"message":"fred-G4vFWfuJWz: Possible cluster misconfiguration. Missing hash slot owner for Some(6606).","log.target":"fred::router::clustered","log.module_path":"fred::router::clustered","log.file":"/usr/local/cargo/registry/src/index.crates.io-6f17d22bba15001f/fred-6.2.1/src/router/clustered.rs","log.line":88},"target":"fred::router::clustered","threadName":"tokio-runtime-worker"
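For readers unfamiliar with the warning above, here is a minimal sketch of what a "missing hash slot owner" means. It is not fred's actual implementation. It only assumes the standard Redis Cluster key hashing (CRC16/XMODEM modulo 16384, with hash tags) and a hypothetical cached slot table, `SlotRange`, whose node names are made up for illustration. If the cached table has a gap, a lookup for a slot such as 6606 finds no owner until the table is refreshed from the cluster.

```rust
/// CRC16 (XMODEM variant: polynomial 0x1021, initial value 0),
/// the checksum the Redis Cluster specification uses for key hashing.
fn crc16_xmodem(data: &[u8]) -> u16 {
    let mut crc: u16 = 0;
    for &byte in data {
        crc ^= (byte as u16) << 8;
        for _ in 0..8 {
            crc = if crc & 0x8000 != 0 {
                (crc << 1) ^ 0x1021
            } else {
                crc << 1
            };
        }
    }
    crc
}

/// Map a key to one of the 16384 hash slots. If the key contains a
/// non-empty hash tag (`{...}`), only the tag content is hashed.
fn key_slot(key: &[u8]) -> u16 {
    let hashed = match key.iter().position(|&b| b == b'{') {
        Some(open) => match key[open + 1..].iter().position(|&b| b == b'}') {
            Some(len) if len > 0 => &key[open + 1..open + 1 + len],
            _ => key,
        },
        None => key,
    };
    crc16_xmodem(hashed) % 16384
}

/// Hypothetical cached view of one slot range and the node that serves it.
struct SlotRange {
    start: u16,
    end: u16,
    owner: String,
}

/// Return the owner of `slot`, or `None` if no cached range covers it.
/// A `None` here corresponds to the situation the warning describes.
fn slot_owner(ranges: &[SlotRange], slot: u16) -> Option<&str> {
    ranges
        .iter()
        .find(|r| r.start <= slot && slot <= r.end)
        .map(|r| r.owner.as_str())
}
```

With a table that covers only 0-5460 and 10923-16383, `slot_owner(&ranges, 6606)` returns `None`. The client cannot route any key hashing to that slot until it re-reads the slot map from the cluster.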
Hi @to266, can you try with 7.1.1?
Will do, but:
- Not sure how quickly we'll manage to update fred in our repo in the first place
- We will likely only see a similar level of load at the end of the month - so until then it should all be good regardless.
Having said that, thanks!
I'd recommend trying 7.1.2 if you can. That release contains a fix for a similar kind of issue.
Closing due to inactivity, but if you still have issues here after 8.0.2 please let me know. There were ~5 potentially relevant fixes for this between 6.3.2 and 8.0.2, so hopefully those address this.