Why does go-redis allow cluster mode cross-shard transactions and pipelines?

Open • LINKIWI opened this issue 1 month ago • 0 comments

By specification, Redis Cluster does not support cross-shard/cross-slot batch operations, such as transactions and pipelines, on cluster mode-enabled deployments:

  • There is no server-side primitive that enables cross-shard transactional atomicity.
  • A single pipeline cannot physically be applied to multiple targets, since it occurs over a single physical connection.

Other client libraries properly reject these operations with a cross-slot validation error. The go-redis library seems to instead (1) partition the batch commands by slot, (2) concurrently dispatch individual batch commands to each shard, and (3) aggregate the results client-side.

https://github.com/redis/go-redis/blob/v9.5.2/osscluster.go#L1237
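To make step (1) concrete, here is a minimal sketch of how keys map to slots under the Redis Cluster key hashing scheme (CRC16/XMODEM of the key, or of its hash tag if present, modulo 16384). The names `keySlot`, `hashTag`, and `groupBySlot` are hypothetical and chosen for illustration; this is not the code go-redis uses internally, only an approximation of the partitioning idea.

```go
package main

import (
	"fmt"
	"strings"
)

const slotCount = 16384 // number of hash slots in a Redis Cluster

// crc16 computes CRC16/XMODEM (polynomial 0x1021, initial value 0),
// the checksum variant used by Redis Cluster for key hashing.
func crc16(data string) uint16 {
	var crc uint16
	for i := 0; i < len(data); i++ {
		crc ^= uint16(data[i]) << 8
		for bit := 0; bit < 8; bit++ {
			if crc&0x8000 != 0 {
				crc = (crc << 1) ^ 0x1021
			} else {
				crc <<= 1
			}
		}
	}
	return crc
}

// hashTag returns the substring between the first '{' and the next '}'
// when that substring is non-empty; otherwise it returns the whole key.
func hashTag(key string) string {
	start := strings.IndexByte(key, '{')
	if start < 0 {
		return key
	}
	end := strings.IndexByte(key[start+1:], '}')
	if end <= 0 { // no closing brace, or an empty tag such as "{}"
		return key
	}
	return key[start+1 : start+1+end]
}

// keySlot returns the cluster slot for a key (hypothetical helper name).
func keySlot(key string) int {
	return int(crc16(hashTag(key))) % slotCount
}

// groupBySlot partitions keys by slot, preserving input order per slot.
func groupBySlot(keys []string) map[int][]string {
	groups := make(map[int][]string)
	for _, k := range keys {
		slot := keySlot(k)
		groups[slot] = append(groups[slot], k)
	}
	return groups
}

func main() {
	fmt.Println(keySlot("a"), keySlot("b")) // 15495 3300
	fmt.Println(len(groupBySlot([]string{"a", "b"})), "slot groups")
}
```

A batch whose keys land in more than one group is exactly the case this issue is about: each group would have to be sent to a different shard, so no single MULTI/EXEC can cover all of them.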

As a simple reproducible example:

127.0.0.1:7000> set a aaaaa
-> Redirected to slot [15495] located at 127.0.0.1:7002
OK
127.0.0.1:7002> set b bbbbb
-> Redirected to slot [3300] located at 127.0.0.1:7000
OK

ctx := context.Background()
pipe := client.TxPipeline()
pipe.Get(ctx, "a")
pipe.Get(ctx, "b")
fmt.Println(pipe.Exec(ctx))  // [aaaaa bbbbb]

This is both extremely surprising behavior and a potential source of severe correctness/consistency problems. I don't see any mechanism to opt out of or otherwise disable the "smart virtual batching" performed by the library.

Why does go-redis do this? Is it possible to implement a client option to disable this behavior?

Expected Behavior

Cross-slot transactions and pipelines should return an error.

Current Behavior

go-redis internally separates out commands by shard and executes them independently without error.

Possible Solution

Provide a switch that lets the client reject a batch operation when per-shard partitioning maps its commands to more than one slot.
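As a rough sketch of what such a switch could enforce, the check below rejects a batch as soon as two of its keys map to different slots. Everything here is hypothetical: `ErrCrossSlot`, `checkSingleSlot`, and the injected `slotOf` function are illustrative names, not part of the go-redis API. The slot lookup in `main` is a stand-in table using the slot numbers from the redis-cli session above; a real client would compute slots with the cluster's key hashing function.

```go
package main

import (
	"errors"
	"fmt"
)

// ErrCrossSlot is a hypothetical sentinel error for batches that span slots.
var ErrCrossSlot = errors.New("CROSSSLOT: batch keys hash to more than one slot")

// checkSingleSlot returns nil when every key maps to the same slot and an
// error wrapping ErrCrossSlot otherwise. slotOf is injected so the check
// stays independent of how slots are computed.
func checkSingleSlot(keys []string, slotOf func(string) int) error {
	if len(keys) == 0 {
		return nil
	}
	first := slotOf(keys[0])
	for _, k := range keys[1:] {
		if s := slotOf(k); s != first {
			return fmt.Errorf("%w: %q is in slot %d, %q is in slot %d",
				ErrCrossSlot, keys[0], first, k, s)
		}
	}
	return nil
}

func main() {
	// Stand-in slot table; values taken from the redis-cli output in this issue.
	slots := map[string]int{"a": 15495, "b": 3300}
	slotOf := func(key string) int { return slots[key] }

	// Keys "a" and "b" live in different slots, so this batch is rejected.
	if err := checkSingleSlot([]string{"a", "b"}, slotOf); err != nil {
		fmt.Println("rejected:", err)
	}

	// A batch that touches a single slot passes the check.
	fmt.Println(checkSingleSlot([]string{"a", "a"}, slotOf) == nil)
}
```

Running a check like this before dispatch, behind an opt-in client option, would keep the current behavior as the default while letting callers who rely on atomicity fail fast.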

Steps to Reproduce

  1. Create a cluster with at least 2 nodes.
  2. Write sample keys spanning both nodes.
  3. Exercise the code snippet above.
  4. Observe that no error is returned.

Context (Environment)

Clients expect transactional atomicity, which can only be guaranteed when the transaction affects keys that all hash to the same slot.

Detailed Description

See above.

Possible Implementation

N/A

LINKIWI, Jun 06 '24 16:06