Cursor based operations not working properly with ReadFrom = ANY
Bug Report
Current Behavior
Hello!
We have encountered the following problem - when using zscan with Redis Cluster and the ReadFrom parameter set to ANY, consecutive requests may be sent to different nodes. Because of this, the response does not comply with the guarantees provided by Redis regarding SCAN family operations. The response may contain duplicate values, and some values may be missing altogether.
Small example - https://github.com/EMDavl/cursor-bug
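The effect described above can be illustrated with a toy model. This is only a sketch under loud assumptions: the `Node` class, the idea that a cursor is a plain offset, and the two iteration orders are all hypothetical stand-ins (real Redis cursors are opaque and not simple offsets). It shows only that a cursor issued by one node need not mean the same position on another node holding the same data.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Toy model only: real Redis cursors are opaque values, not plain offsets.
class CursorPortabilityDemo {

    // Hypothetical stand-in for one node: same members, node-specific iteration order.
    static final class Node {
        private final List<Integer> order;

        Node(List<Integer> order) {
            this.order = order;
        }

        // Returns up to `count` members starting at position `cursor` in this node's order.
        List<Integer> page(int cursor, int count) {
            return order.subList(cursor, Math.min(cursor + count, order.size()));
        }

        int size() {
            return order.size();
        }
    }

    // Node that iterates members 0..n-1 in ascending order.
    static Node ascending(int n) {
        List<Integer> order = new ArrayList<>();
        for (int i = 0; i < n; i++) {
            order.add(i);
        }
        return new Node(order);
    }

    // Node that holds the same members but iterates them in descending order.
    static Node descending(int n) {
        List<Integer> order = new ArrayList<>();
        for (int i = n - 1; i >= 0; i--) {
            order.add(i);
        }
        return new Node(order);
    }

    // Runs one full iteration, sending request i to nodes[i % nodes.length],
    // and returns the distinct members that were seen.
    static Set<Integer> scan(int count, Node... nodes) {
        Set<Integer> seen = new HashSet<>();
        int total = nodes[0].size();
        int request = 0;
        for (int cursor = 0; cursor < total; cursor += count) {
            Node target = nodes[request++ % nodes.length];
            seen.addAll(target.page(cursor, count));
        }
        return seen;
    }

    public static void main(String[] args) {
        Node a = ascending(10);
        Node b = descending(10);
        System.out.println("Pinned to one node: " + scan(3, a).size());
        System.out.println("Alternating nodes:  " + scan(3, a, b).size());
    }
}
```

In this model the pinned iteration sees all 10 members, while the alternating one sees only 8 distinct members: two are returned twice and two are never returned.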
Expected behavior/code
SCAN family requests return correct values with the ReadFrom parameter set to ANY when the library is used with Redis Cluster. Requests are sent to the same node.
Environment
- Lettuce version(s): 6.3.2.RELEASE
- Redis version: 7+
Possible Solution
Send requests related to scan-based operations to the same node until the iteration ends.
@tishun Hi! Are there any updates on this topic?
Will try to take a look this week, apologies for the delay
Hey @EMDavl ,
/**
 * Setting to read from any node.
 *
 * @since 5.2
 */
public static final ReadFrom ANY = new ReadFromImpl.ReadFromAnyNode();
Am I correct to assume you want to both use ReadFrom.ANY and at the same time (only for the SCAN family of commands) read from the same node? Can you elaborate more on the use case? I assume ReadFrom.LOWEST_LATENCY is, for some reason, not going to work for you?
Hello again,
I've taken the liberty of modifying your code and it now reads all the values:
Set<String> set = new HashSet<>();
ScoredValueScanCursor<String> cursor = null;
do {
    if (cursor == null) {
        cursor = sync.zscan(zsetName, ScanCursor.of("0"), ScanArgs.Builder.limit(1000));
    } else {
        cursor = sync.zscan(zsetName, ScanCursor.of(cursor.getCursor()), ScanArgs.Builder.limit(1000));
    }
    for (ScoredValue<String> value : cursor.getValues()) {
        set.add(value.getValue());
    }
} while (!cursor.isFinished());
Does that help?
Hello!
Unfortunately, the solution you suggested didn't resolve the issue. I replaced it with what I had in my MRE, and in each of the three test runs, only part of the records were read. Perhaps you could share how you tested its functionality?
Can you elaborate more on the use case
Initially, our project was configured with the ReadFrom.ANY parameter to reduce the load on the nodes. After noticing the issue, we switched to the option you suggested, and so far, it has worked for us. The goal of this issue is to bring this problem to the attention of the development team, so that this behavior can either be fixed or documented.
you want to both use ReadFrom.ANY and at the same time (only for the SCAN family of commands) read from the same node
Not necessarily; the main requirement is that the SCAN family of commands, when used with ReadFrom.ANY and a Redis cluster, should return all values.
Greetings again,
Unfortunately, the solution you suggested didn't resolve the issue. I replaced it with what I had in my MRE, and in each of the three test runs, only part of the records were read. Perhaps you could share how you tested its functionality?
I deployed a new, empty cluster and ran the code with the change I suggested: https://github.com/EMDavl/cursor-bug/pull/1
This produced the result I expected:
Entries read: 10000
Without my change the result was random, but always less than 10000, for example:
Entries read: 7602
Can you elaborate more on the use case
The issue is that the order in which you check cursor.isFinished() and call cursor = sync.zscan(zsetName, ScanCursor.of(cursor.getCursor()), ScanArgs.Builder.limit(1000)); is reversed. You need to check whether the cursor is finished only after calling zscan() on the new cursor; otherwise you are not following up on all the cursors. You can debug this by adding a new call just above the break statement: you will see that more results are returned, but because of the break statement they are not processed.
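The ordering problem can be reproduced without Redis. The sketch below is an assumption about the shape of the bug, not the code from the linked repository: `FakeScanner` and `Page` are hypothetical stand-ins for `zscan()` and its reply, and the fixed page size is a simplification, since the real limit is only a hint.

```java
import java.util.ArrayList;
import java.util.List;

class ScanLoopOrderDemo {

    // Hypothetical stand-in for one scan reply: a batch of values plus a "finished" flag.
    static final class Page {
        final List<Integer> values;
        final boolean finished;

        Page(List<Integer> values, boolean finished) {
            this.values = values;
            this.finished = finished;
        }
    }

    // Hypothetical in-memory source that hands out `total` entries in fixed-size pages.
    static final class FakeScanner {
        private final int total;
        private final int pageSize;
        private int next = 0;

        FakeScanner(int total, int pageSize) {
            this.total = total;
            this.pageSize = pageSize;
        }

        // The last page still carries values AND reports finished == true.
        Page nextPage() {
            List<Integer> values = new ArrayList<>();
            while (values.size() < pageSize && next < total) {
                values.add(next++);
            }
            return new Page(values, next >= total);
        }
    }

    // Assumed buggy shape: breaks as soon as the reply is finished,
    // so the values carried by the final reply are never processed.
    static int readBreakingEarly(FakeScanner scanner) {
        int read = 0;
        while (true) {
            Page page = scanner.nextPage();
            if (page.finished) {
                break;
            }
            read += page.values.size();
        }
        return read;
    }

    // Processes every reply first and only then checks whether it was the last one.
    static int readThenCheck(FakeScanner scanner) {
        int read = 0;
        Page page;
        do {
            page = scanner.nextPage();
            read += page.values.size();
        } while (!page.finished);
        return read;
    }

    public static void main(String[] args) {
        System.out.println("Break before processing: " + readBreakingEarly(new FakeScanner(10000, 1000)));
        System.out.println("Process, then check:     " + readThenCheck(new FakeScanner(10000, 1000)));
    }
}
```

With 10000 entries and pages of 1000, the early-break loop reads 9000 entries and the process-then-check loop reads 10000. Against a real server the shortfall would vary from run to run, because page sizes are not fixed.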
Initially, our project was configured with the ReadFrom.ANY parameter to reduce the load on the nodes. After noticing the issue, we switched to the option you suggested, and so far, it has worked for us. The goal of this issue is to bring this problem to the attention of the development team, so that this behavior can either be fixed or documented.
you want to both use ReadFrom.ANY and at the same time (only for the SCAN family of commands) read from the same node
Not necessarily; the main requirement is that the SCAN family of commands, when used with ReadFrom.ANY and a Redis cluster, should return all values.
You are correct! I was mistakenly assuming the driver does not follow the contract, but in fact it does, even when ReadFrom.ANY is used. So you should be able to use ReadFrom.ANY and still fetch the cursor from multiple different shards.
Thank you very much for your time and clarifications. Unfortunately, even with the proposed solution, I am still getting inconsistent results. It might be due to my environment. I will try to carefully double-check everything again over the upcoming weekend and will get back to you with the results.
Apologies for the delay, the last few weeks have been quite busy, and the next couple don't promise to be any freer. I'll reply to the thread as soon as I'm able to double-check everything
Sure, I will be happy to help if there is still something unclear or you discover an issue with the driver