
Duplicate data was generated during expansion.

[Open] VIVALXH opened this issue 11 months ago • 10 comments

Search before asking

  • [x] I had searched in the issues and found no similar issues.

Version

2.10

Minimal reproduce step

Env: Kvrocks cluster with 4 shards.

  1. Use go-redis and open multiple goroutines that concurrently and continuously generate data (a complete runnable sketch follows this list):
// helper and c below are the reporter's own wrappers around go-redis.
userId := helper.GetRandomInt64InRange(1, 10000) // random user id in [1, 10000]
key := strconv.FormatInt(userId, 10)

// Each value embeds a UUID and a nanosecond timestamp, so every element is unique.
uuidStr := uuid.New().String()
timestamp := time.Now().UnixNano()
value := fmt.Sprintf("%s-%d", uuidStr, timestamp)

_, err := c.Do(context.TODO(), 2, "LPUSH", key, value)
  2. Use the following commands to expand the Kvrocks cluster from 4 shards to 8 shards:
CLUSTERX SETNODEID $NODE_ID
CLUSTERX MIGRATE $slot $dst_nodeid
CLUSTERX SETSLOT $slot NODE $node_id $new_version
  3. Check the data after the expansion.
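
For completeness, here is a self-contained version of the writer from step 1. It is a minimal sketch: the cluster addresses, goroutine count, and iteration count are illustrative placeholders, and go-redis's ClusterClient with LPush stands in for the reporter's c.Do wrapper.

package main

import (
	"context"
	"fmt"
	"math/rand"
	"strconv"
	"sync"
	"time"

	"github.com/google/uuid"
	"github.com/redis/go-redis/v9"
)

func main() {
	// Placeholder addresses for the Kvrocks cluster nodes.
	rdb := redis.NewClusterClient(&redis.ClusterOptions{
		Addrs: []string{"127.0.0.1:6666", "127.0.0.1:6667"},
	})
	defer rdb.Close()

	var wg sync.WaitGroup
	for g := 0; g < 16; g++ { // multiple goroutines writing concurrently
		wg.Add(1)
		go func() {
			defer wg.Done()
			for i := 0; i < 1000; i++ {
				userID := rand.Int63n(10000) + 1 // random user id in [1, 10000]
				key := strconv.FormatInt(userID, 10)
				// UUID + nanosecond timestamp makes every element unique,
				// so any duplicate found later must come from a double write.
				value := fmt.Sprintf("%s-%d", uuid.New().String(), time.Now().UnixNano())
				if err := rdb.LPush(context.TODO(), key, value).Err(); err != nil {
					fmt.Println("LPUSH error:", err) // errors are expected during migration
				}
			}
		}()
	}
	wg.Wait()
}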

What did you expect to see?

No duplicate data is generated.

What did you see instead?

[root@iZwz9dff7pa0adxvbwkbc8Z ~]# ./check_redis_list
2025/01/16 13:34:20 maxprocs: Leaving GOMAXPROCS=4: CPU quota undefined
LIST 1978 has duplicate data:
duplicate data: 99b22bda-96de-43ec-98d4-d74d5a590d11-1736994529584945168
duplicate data index: [43 44]
list length: 395

LIST 3171 has duplicate data:
duplicate data: 9bb7a7cc-3004-4d5b-965d-fe171badc21a-1736994509584166314
duplicate data index: [86 87]
list length: 412

LIST 3468 has duplicate data:
duplicate data: 9c3bc47b-d92c-4d55-b588-9fbc594849fc-1736994519583905106
duplicate data index: [65 66]
list length: 406

LIST 3616 has duplicate data:
duplicate data: 41342d98-ace0-4212-b9ea-437772a17e68-1736994539585064098
duplicate data index: [32 33]
list length: 376

LIST 5519 has duplicate data:
duplicate data: c9fb599e-453e-4615-bc7c-74b1b8a8bec0-1736994529583967377
duplicate data index: [51 52]
list length: 414

LIST 6445 has duplicate data:
duplicate data: 3d1c3404-af08-4d47-a0f8-2fa0cea0885a-1736994509583942704
duplicate data index: [87 88]
list length: 377

LIST 8973 has duplicate data:
duplicate data: ab937bce-f43f-4f2c-a65a-b1c6cc5f08b9-1736994529584052504
duplicate data index: [55 56]
list length: 414

LIST 9250 has duplicate data:
duplicate data: 1550b5fc-ba5a-4d88-9b9d-daf9c52650ef-1736994509583124277
duplicate data index: [82 83]
list length: 421

LIST 9555 has duplicate data:
duplicate data: 606ab92e-4052-43fb-900d-91e45892ef2a-1736994529583492941
duplicate data index: [59 60]
list length: 395

Anything Else?

No response

Are you willing to submit a PR?

  • [ ] I'm willing to submit a PR!

VIVALXH · Jan 16 '25 05:01

@VIVALXH Great! Thanks for the detailed information, it's very helpful for identifying this issue.

git-hulk · Jan 17 '25 03:01

Hi @VIVALXH

go-redis might retry on network errors. Would you mind disabling the retry to see if the issue still exists? https://github.com/redis/go-redis/blob/1139bc3aa9073851f67faa6d68df07a566901dd7/options.go#L75C2-L75C12

git-hulk · Feb 02 '25 09:02

@git-hulk thanks.

No duplicate data was generated, but this resulted in the loss of a significant amount of data that should have been written.

I controlled the osscluster retry count by setting redis.UniversalOptions.MaxRedirects = -1. (https://github.com/redis/go-redis/blob/1139bc3aa9073851f67faa6d68df07a566901dd7/osscluster.go#L45)
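
For reference, that setting looks like this (a minimal sketch; the address is a placeholder):

package main

import "github.com/redis/go-redis/v9"

func main() {
	// With MaxRedirects = -1, go-redis's cluster client attempts each command
	// once and does not retry on MOVED, ASK, or TRYAGAIN errors.
	rdb := redis.NewUniversalClient(&redis.UniversalOptions{
		Addrs:        []string{"127.0.0.1:6666"}, // placeholder address
		MaxRedirects: -1,
	})
	defer rdb.Close()
}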

However, this inevitably led to MOVED and TRYAGAIN errors, as mentioned in this PR (https://github.com/apache/kvrocks/pull/1240).

The MOVED error occurs between the following two commands:

  • CLUSTERX MIGRATE $slot $dst_nodeid
  • CLUSTERX SETSLOT $slot NODE $node_id $new_version

Additionally, if the interval between these two commands is longer (e.g., 10 seconds), the probability of generating duplicate data increases. The duplicate data primarily appears on the new nodes after scaling out. Following the process I described, this issue can be consistently reproduced.

VIVALXH · Feb 06 '25 07:02

I controlled the osscluster retry count by setting redis.UniversalOptions.MaxRedirects = -1.

@VIVALXH You should set MaxRetries to -1 instead of MaxRedirects. Otherwise it will also prevent retrying when a TRYAGAIN error is received.
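
Side by side, the two knobs being discussed look like this (go-redis ClusterOptions; a sketch with a placeholder address):

package main

import "github.com/redis/go-redis/v9"

func main() {
	rdb := redis.NewClusterClient(&redis.ClusterOptions{
		Addrs:        []string{"127.0.0.1:6666"}, // placeholder address
		MaxRetries:   -1, // per-command retries, e.g. after network errors
		MaxRedirects: -1, // the cluster loop that follows MOVED/ASK and retries TRYAGAIN
	})
	defer rdb.Close()
}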

git-hulk · Feb 06 '25 12:02

@git-hulk

I am using ClusterClient, and internally it uses MaxRedirects instead of MaxRetries in its process function. It worked as expected and achieved our goal of preventing duplicate data. https://github.com/redis/go-redis/blob/1139bc3aa9073851f67faa6d68df07a566901dd7/osscluster.go#L967

This suggests that retries might have caused the data duplication, but in practice we cannot disable retries entirely, because doing so loses a significant amount of data.
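
For readers following along, here is a heavily simplified, illustrative sketch of why this loop is bounded by MaxRedirects rather than MaxRetries. This is not go-redis's actual code; it only mimics the shape of the process function linked above.

package main

import (
	"errors"
	"fmt"
	"strings"
	"time"
)

// processWithRedirects mimics the shape of a cluster client's command loop:
// MOVED/ASK and TRYAGAIN handling lives here, bounded by maxRedirects.
func processWithRedirects(do func() error, maxRedirects int) error {
	if maxRedirects == -1 {
		maxRedirects = 0 // -1 means "attempt once, never retry"
	}
	var err error
	for attempt := 0; attempt <= maxRedirects; attempt++ {
		if err = do(); err == nil {
			return nil
		}
		msg := err.Error()
		switch {
		case strings.HasPrefix(msg, "MOVED"), strings.HasPrefix(msg, "ASK"):
			// A real client would switch to the node named in the error
			// before the next iteration retries the command.
		case strings.HasPrefix(msg, "TRYAGAIN"):
			// The slot is being migrated; back off, then retry.
			time.Sleep(time.Duration(attempt+1) * 100 * time.Millisecond)
		default:
			return err // other errors are not retried in this loop
		}
	}
	return err
}

func main() {
	err := processWithRedirects(func() error {
		return errors.New("TRYAGAIN Slot is migrating")
	}, 3)
	fmt.Println("final error:", err)
}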

VIVALXH · Feb 06 '25 13:02

@VIVALXH The cluster client also supports MaxRetries; setting MaxRedirects=-1 means it won't retry when a TRYAGAIN error occurs.

git-hulk · Feb 06 '25 14:02

Hi @VIVALXH

go-redis might retry on network errors. Would you mind disabling the retry to see if the issue still exists? https://github.com/redis/go-redis/blob/1139bc3aa9073851f67faa6d68df07a566901dd7/options.go#L75C2-L75C12

@git-hulk So we now conclude that retries may have caused the duplicate data, but in practice it is not feasible to disable them.

VIVALXH · Feb 06 '25 14:02

@VIVALXH The cluster client also supports MaxRetries; setting MaxRedirects=-1 means it won't retry when a TRYAGAIN error occurs.

@git-hulk Hello, I tried setting MaxRetries=-1 today, but duplicate data still occurred. Only when I set MaxRedirects=-1 did the duplicate data stop.

So I guess the data duplication is caused by retries triggered by MOVED or TRYAGAIN errors?

VIVALXH · Feb 07 '25 01:02

So I guess the data duplication is caused by retries triggered by MOVED or TRYAGAIN errors?

I'm not quite sure about this. But after thinking it over, it's impossible to push an element to a list exactly once without a deduplication mechanism, because we always need to retry on any network issue or timeout, otherwise we might lose data.
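
One possible deduplication mechanism along these lines, purely as a sketch: since every element already embeds a UUID and a timestamp and is therefore unique, a retry can first check whether the previous attempt actually landed before pushing again. This assumes the server supports LPOS (a Redis 6.0.6+ command; verify Kvrocks support for your version), and it is still not fully race-free across a slot migration.

package dedup

import (
	"context"

	"github.com/redis/go-redis/v9"
)

// lpushOnce retries an LPUSH, but checks for the (unique) value first so that
// a retry after an ambiguous failure does not insert a duplicate element.
func lpushOnce(ctx context.Context, rdb redis.UniversalClient, key, value string) error {
	const maxAttempts = 5
	var err error
	for attempt := 0; attempt < maxAttempts; attempt++ {
		if attempt > 0 {
			// Did the previous attempt succeed even though it returned an error?
			if _, posErr := rdb.LPos(ctx, key, value, redis.LPosArgs{}).Result(); posErr == nil {
				return nil // value is already in the list; don't push it again
			}
			// redis.Nil means "not found", so pushing again is safe; any other
			// error means we could not verify, and we retry the push anyway.
		}
		if err = rdb.LPush(ctx, key, value).Err(); err == nil {
			return nil
		}
	}
	return err
}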

git-hulk · Feb 07 '25 12:02

So I guess the data duplication is caused by retries triggered by MOVED or TRYAGAIN errors?

I'm not quite sure about this. But after thinking it over, it's impossible to push an element to a list exactly once without a deduplication mechanism, because we always need to retry on any network issue or timeout, otherwise we might lose data.

Yes, it is impossible to disable retries, because doing so loses a significant amount of data.

At present, duplicate data only appears during scaling. Is there any other solution that can avoid this?

VIVALXH · Feb 07 '25 14:02