ioredis icon indicating copy to clipboard operation
ioredis copied to clipboard

commandTimeout has unexpected behavior for cluster

Open spurrkins opened this issue 2 years ago • 3 comments

The command timeout is only set on the request to an individual Redis instance: https://github.com/luin/ioredis/blob/v5.0.6/lib/Redis.ts#L422

But, if there's a network partition, in cluster mode commands will be enqueued in the offline queue indefinitely, without ever setting the timeout: https://github.com/luin/ioredis/blob/v5.0.6/lib/cluster/index.ts

For certain workloads, this could result in a total denial of service as commands will enqueue indefinitely until the partition is resolved. I think expected behavior would be to honor the command timeout in cluster mode as well.

spurrkins avatar Jun 14 '22 19:06 spurrkins

We just encountered this issue as well. I was under the impression that commandTimeout is set regardless of whether the command was actually sent to any node, but apperatenly not.

We need a way to timeout "offline" commands

shaharmor avatar Aug 15 '22 20:08 shaharmor

I worked around it by turning off the offline queue and implementing retries w/ circuit breaking around my operations against the client. Was able to solve my problem but it's far from a comprehensive solution.

spurrkins avatar Aug 15 '22 23:08 spurrkins

I came across this and have checked through the implementation

The command timeout is only set on the request to an individual Redis instance

in cluster mode commands will be enqueued in the offline queue indefinitely, without ever setting the timeout

This is not true as the implementation of cluster client is also based on the Redis instance. The redis option with commandTimeout will be set for each redis instances in the connection pool. Ideally the commandTimeout is made to provide timeout on actual redis command execution. When the redis cluster isn't ready the command will just be queued up in the offlineQueue without executing the command. Once the readyHandler is triggered the commands queued up will be executed and since it's calling redis.sendCommand the timeout countdown will be started in the Redis instance.

I think this feature was designed for using redis as message queue to minimize data lost when redis is not available. It will be confusing when we just want a fast-failing call on caching. I guess we might want something like callTimeout which setting the timeout within the command.setTimeout function in Cluster that handles most of the cases.

In case you want a hackaround, this might work

const originalSendCommand = Cluster.prototype.sendCommand
Cluster.prototype.sendCommand = function (command, stream, args) {
  command.setTimeout(1000)
  return originalSendCommand.call(this, command, stream, args)
}

Seitk avatar Jul 27 '23 15:07 Seitk