redis-rb
Redis::ProtocolError: Got 'Protocol error, got "\r" as reply type byte' as initial reply byte. If you're in a forking environment, such as Unicorn, you need to connect to Redis after forking.
I recently migrated from using a Redis instance to a Redis cluster in my Rails app. And to use the benefits of cluster mode in redis-rb, I changed my connection string to:
redis = Redis.new(:cluster => %W[redis://redis-url:redis-port])
But, I am getting intermittent connection issues with the error:
"Redis::ProtocolError: Got 'Protocol error, got "\r" as reply type byte' as initial reply byte. If you're in a forking environment, such as Unicorn, you need to connect to Redis after forking"
And the stacktrace is as follows:
redis-4.0.3/lib/redis/connection/hiredis.rb:60 in rescue in read
redis-4.0.3/lib/redis/connection/hiredis.rb:53 in read
redis-4.0.3/lib/redis/client.rb:265 in block in read
redis-4.0.3/lib/redis/client.rb:253 in io
redis-4.0.3/lib/redis/client.rb:264 in read
redis-4.0.3/lib/redis/client.rb:123 in block in call
redis-4.0.3/lib/redis/client.rb:234 in block (2 levels) in process
redis-4.0.3/lib/redis/client.rb:372 in ensure_connected
redis-4.0.3/lib/redis/client.rb:224 in block in process
redis-4.0.3/lib/redis/client.rb:309 in logging
redis-4.0.3/lib/redis/client.rb:223 in process
redis-4.0.3/lib/redis/client.rb:123 in call
redis-4.0.3/lib/redis/cluster.rb:215 in public_send
redis-4.0.3/lib/redis/cluster.rb:215 in try_send
redis-4.0.3/lib/redis/cluster.rb:151 in send_command
redis-4.0.3/lib/redis/cluster.rb:72 in call
redis-4.0.3/lib/redis.rb:1343 in block in sadd
redis-4.0.3/lib/redis.rb:50 in block in synchronize
I can understand why this was not an issue when I was using a single Redis instance with the connection string like:
redis = Redis.new(url: "redis://redis-url:redis-port/db")
That works because lib/redis/client.rb has an ensure_connected method that reconnects when errors crop up after forking child processes that use the Redis client. However, I could not find any code that handles reconnection in lib/redis/cluster.rb or lib/redis/cluster/node.rb, which are used in cluster mode.
How can I fix this issue when I am connecting via cluster mode?
when any errors crop up during forking of new child processes that use the redis client.
Do you disconnect in the parent before fork?
No, I never did. redis-rb always took care of reconnecting, so I never had to handle disconnections manually in code when forking.
Hum, we might be missing this check in the cluster client. I'd say try that.
Yes, I think the same as well. But instead of disconnecting in the parent, I would prefer reconnecting in the child or forked processes. Can you point me to the right piece of code in cluster client, so that I can override it to include the ensure_connected function to reconnect when any errors occur?
Never mind, I see ensure_connected in your backtrace, but it explicitly raises if you reuse the connection: https://github.com/redis/redis-rb/blob/af0b66cf99cc8e9f427452f8eb9da6366dc6257c/lib/redis/client.rb#L394-L399, so I doubt it's that.
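For readers following along, the kind of check being described can be sketched roughly as below. This is only an illustration of a PID-based guard, not the actual redis-rb source: the class name PidGuardedConnection, the InheritedSocketError error, and the optional pid arguments are all hypothetical and exist only so the behaviour can be exercised without a real fork.

```ruby
# Hypothetical error, standing in for whatever the real client raises.
class InheritedSocketError < StandardError; end

# Hypothetical connection wrapper: it remembers the PID that opened the
# connection and refuses to reuse it from any other process.
class PidGuardedConnection
  def initialize(inherit_socket: false)
    @inherit_socket = inherit_socket
    @pid = nil
    @connected = false
  end

  # Record the PID of the process that opened the connection.
  def connect(pid = Process.pid)
    @pid = pid
    @connected = true
  end

  # Connect lazily; raise if an already-open connection is being used
  # from a different process, unless inherit_socket was requested.
  def ensure_connected(current_pid = Process.pid)
    if @connected
      unless @inherit_socket || current_pid == @pid
        raise InheritedSocketError,
              "connection opened in PID #{@pid} used from PID #{current_pid}"
      end
    else
      connect(current_pid)
    end
    yield if block_given?
  end
end
```

With a guard like this, a child that reuses the parent's connection gets an explicit error rather than garbled replies, which is why a protocol error is surprising here.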
Unless you configured your client with inherit_socket: true?
No, @byroot. I did not configure that option. It is strange that I get this issue only with the Redis cluster connection string.
@byroot How and where do you suggest I reconnect to Redis in my Rails application? I initialise a Redis client in one of the initializer files. Is there a Redis.reconnect that I can use?
Most forking servers or job runners provide some "before_fork" and "after_fork" callbacks. That's where you should handle the disconnect and reconnect.
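If wiring up those callbacks is awkward, one alternative is to build the client lazily and rebuild it whenever the process ID changes. The sketch below is only an illustration under that assumption: ForkSafeClient and its pid_source parameter are hypothetical names, not part of redis-rb, and the block passed in is whatever constructs your real client.

```ruby
# Hypothetical helper: hands out a client built by the given block and
# rebuilds it when the current PID differs from the PID that built it,
# so a forked child never reuses the parent's socket.
class ForkSafeClient
  # pid_source is injectable only to make the behaviour easy to test.
  def initialize(pid_source: -> { Process.pid }, &factory)
    raise ArgumentError, "factory block required" unless factory

    @factory = factory
    @pid_source = pid_source
    @mutex = Mutex.new
    @client = nil
    @owner_pid = nil
  end

  # Returns the client for the current process, building it on first use
  # and again after a fork has been detected.
  def client
    @mutex.synchronize do
      current = @pid_source.call
      if @client.nil? || @owner_pid != current
        @client = @factory.call
        @owner_pid = current
      end
      @client
    end
  end
end

# Usage in an initializer might look like this (not executed here):
#   REDIS = ForkSafeClient.new { Redis.new(:cluster => %W[redis://redis-url:redis-port]) }
#   REDIS.client.sadd("key", "member")
```

The trade-off is that the parent's old client object is simply abandoned in the child rather than explicitly closed, so this complements the before_fork/after_fork hooks rather than replacing them.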
@byroot I found the issue to be related to Delayed Job in my application. For some reason, the Redis cluster connection has issues when the Delayed Job is a long-running one. One additional thing to consider is that I am using Redis namespace on top of my Redis client. Is this a known issue? If so, how can I go about fixing this?
For some reason, the Redis cluster connection has issues when the Delayed Job is a long running one
Hum, doesn't ring any bell. Could be the cluster code not handling reconnection properly, but sounds weird.
I am using Redis namespace on top of my Redis client
Should have 0 impact on protocol.
Is this a known issue?
Well, even with that extra information, it's still very unclear what's going on.
Could it be possible that the error message is incorrect? Maybe it is a data issue? And also, would this issue be fixed if we do not use hiredis?
Update: @byroot I have not been able to reproduce the issue since I removed hiredis from my application. Could the problem be a combination of using Redis cluster and hiredis?
I have not been able to reproduce the issue since I removed hiredis from my application. Could the problem be a combination of using Redis cluster and hiredis?
Interesting. The hiredis driver is barely maintained, so it's possible it has some fork safety issues.
Regardless, all this code was just entirely rewritten, so I'll close this, assuming it's likely solved.