KEYS should be possible

Open sgohl opened this issue 5 years ago • 4 comments

From my perspective, the limitation of not being able to get all KEYS is a total dealbreaker.

I assume this is a deliberate design limitation, but I want to ask politely whether you really see no chance of implementing it somehow.

Technically, corvus would just have to SCAN all cluster nodes and return the combined result to the client as if it were a single instance. This may be slower, but slow is better than not available.

Since SCAN is also not supported, there seems to be no alternative for getting all keys in a cluster.

Is there any technical reason why this can't be done?

sgohl avatar May 15 '19 10:05 sgohl

SCAN requires an iterator (a cursor), and each group of nodes has its own cursor. So there are mainly two ways to implement it:

All Together

You send a simple command such as scan 2 to the proxy and receive a list of results like:

10.10.99.21:6379

1) 29173
2) 1) key1 => val1
   ...
   N) keyN => valN

10.10.99.25:6393

1) 33123
2) 1) key1 => val1
   ...
   N) keyN => valN

So you must pass all the cursors back to the proxy as parameters to fetch the next page, with a command like scan 10.10.99.21:6379:29173 10.10.99.25:6393:33123 ...

Oh it's complicated.
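To make the multi-cursor idea concrete, here is a minimal sketch of how a client or proxy might parse and rebuild those host:port:cursor tokens. The function names are hypothetical and nothing like this exists in corvus; the only Redis behaviour relied on is that a returned cursor of 0 means the iteration on that node is finished.

```python
from typing import Dict, List


def parse_cursors(tokens: List[str]) -> Dict[str, int]:
    """Turn tokens like '10.10.99.21:6379:29173' into {'10.10.99.21:6379': 29173}."""
    cursors: Dict[str, int] = {}
    for token in tokens:
        # Split on the last colon: everything before it is host:port.
        backend, _, cursor = token.rpartition(":")
        if not backend or not cursor.isdigit():
            raise ValueError(f"malformed cursor token: {token!r}")
        cursors[backend] = int(cursor)
    return cursors


def format_cursors(cursors: Dict[str, int]) -> List[str]:
    """Build the tokens for the next call, dropping backends that returned cursor 0."""
    return [f"{backend}:{cursor}" for backend, cursor in cursors.items() if cursor != 0]
```

The walk is complete once format_cursors returns an empty list. Even in this small form the client has to carry one cursor per backend, which is the complication mentioned above.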

One Backend Each Time

This would be better, but the client still needs to specify the backend and the cursor together for each scan.
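A rough sketch of that one-backend-at-a-time stepping, assuming the state handed back to the client is a (backend index, cursor) pair. The scan_fn parameter is a stand-in for whatever actually issues SCAN against a node; all names here are made up for illustration.

```python
from typing import Callable, List, Optional, Tuple

# scan_fn(backend, cursor) -> (next_cursor, keys); hypothetical stand-in for a real SCAN call.
ScanFn = Callable[[str, int], Tuple[int, List[str]]]
State = Tuple[int, int]  # (index into the backend list, cursor on that backend)


def scan_step(backends: List[str], state: State, scan_fn: ScanFn) -> Tuple[Optional[State], List[str]]:
    """Run one SCAN call and return the state for the next call, or None when done."""
    index, cursor = state
    next_cursor, keys = scan_fn(backends[index], cursor)
    if next_cursor != 0:
        # Same backend, continue from the cursor it returned.
        return (index, next_cursor), keys
    if index + 1 < len(backends):
        # This backend is finished; start the next one from cursor 0.
        return (index + 1, 0), keys
    return None, keys
```

The client starts with state (0, 0) and keeps calling until the returned state is None. The backend list has to stay stable for the whole walk, which a real cluster does not guarantee.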

Scenario

In most scenarios, the need for SCAN comes from operations work. But there are other ways to handle that, including fetching the cluster map and scanning each node from a script.

So what's your scenario here?

jasonjoo2010 avatar May 20 '19 06:05 jasonjoo2010

Hey, thanks for your time. There's a good chance that I am doing or thinking something wrong due to a lack of knowledge of the details or inner workings.

But my understanding of the purpose of a proxy in this case is that it simulates the responses of a single Redis instance. The actual need for KEYS came from the simple point of view that you just want to get all keys because there's no way to pre-iterate; the client does not know the names of the keys it wants, so to speak.

Maybe I am missing something completely, or there's a better way of doing it

sgohl avatar May 20 '19 07:05 sgohl

That depends.

KEYS

The KEYS command is flagged as dangerous on the official website redis.io because it is expensive, especially when there are many keys. This applies even to a single instance: because Redis executes commands serially, all following commands are blocked while KEYS runs. So the official recommendation is to replace it with SCAN.

SCAN

First, let's look at how it differs from KEYS:

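  1. It does not filter by prefix efficiently. The MATCH option is applied after the elements are retrieved, so a call may return few or no keys.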
  2. It does not guarantee that a key is returned only once. A key is unique within one call but may be repeated across calls.
  3. It does not produce the whole collection at once. Compared to KEYS, it is more accurate to call it a walk than a listing.
  4. Each call gives you a cursor to use in the next call, but the number of calls may be unbounded (the walk can go on forever).
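The points above translate into a client-side loop roughly like the following sketch for a single node: it follows the cursor until it returns to 0, removes duplicates with a set, and caps the number of calls so the walk cannot run forever. Both scan_page and max_calls are hypothetical names chosen for this example.

```python
from typing import Callable, List, Set, Tuple


def collect_keys(scan_page: Callable[[int], Tuple[int, List[str]]],
                 max_calls: int = 100_000) -> Set[str]:
    """Walk one node with SCAN-style paging and return the distinct keys seen."""
    seen: Set[str] = set()
    cursor = 0
    for _ in range(max_calls):
        cursor, keys = scan_page(cursor)
        seen.update(keys)  # a key may show up in more than one page
        if cursor == 0:    # cursor 0 marks the end of the iteration
            return seen
    raise RuntimeError("SCAN did not finish within max_calls")
```

With the redis-py client, scan_page could be something like lambda c: client.scan(cursor=c, count=1000), since that method returns a (cursor, keys) pair; the client object and the count value are assumptions here.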

So implementing it is complicated.

Scenario

So we need to come back to the actual scenario. If this is part of regular application logic, you should switch to a different implementation (change the logic).

But if you need it for operational reasons (e.g. deleting keys that were set incorrectly, i.e. a repair after a logic bug), you can write a script that walks the cluster manually, bypassing the proxy.
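Such a repair script might look roughly like the sketch below. It assumes you already have the list of master nodes (for example from the output of CLUSTER NODES) and takes the scan and delete operations as parameters, so every name in it is hypothetical rather than part of corvus or any client library.

```python
from typing import Callable, Iterable, List, Tuple

# scan_fn(node, cursor) -> (next_cursor, keys); delete_fn(node, keys) -> number deleted.
ScanFn = Callable[[str, int], Tuple[int, List[str]]]
DeleteFn = Callable[[str, List[str]], int]


def repair_cluster(nodes: Iterable[str], scan_fn: ScanFn,
                   is_bad: Callable[[str], bool], delete_fn: DeleteFn) -> int:
    """Walk every node directly and delete the keys that is_bad() selects."""
    deleted = 0
    for node in nodes:
        cursor = 0
        while True:
            cursor, keys = scan_fn(node, cursor)
            bad = [key for key in keys if is_bad(key)]
            if bad:
                # Delete on the node that owns the keys, not through the proxy.
                deleted += delete_fn(node, bad)
            if cursor == 0:
                break
    return deleted
```

Because each node is scanned and cleaned on its own connection, no cross-node cursor is needed, which is why this is simpler than doing the same thing inside the proxy.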

Does it help?

jasonjoo2010 avatar May 29 '19 03:05 jasonjoo2010

The KEYS command is dangerous. SCAN is not that easy to implement, because it requires storing two indices inside one number. Anyway, I recommend using a replication tool (redis-migrate-tool) to get all the data, especially for analysing the stored data.
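As an illustration of what storing two indices inside one number could mean, here is a small sketch that packs a node index and that node's cursor into a single integer. The 16-bit split and the function names are assumptions made for this example, not a description of how corvus or any other proxy does it.

```python
NODE_BITS = 16  # hypothetical: low bits reserved for the node index
NODE_MASK = (1 << NODE_BITS) - 1


def pack_cursor(node_index: int, node_cursor: int) -> int:
    """Combine a node index and that node's SCAN cursor into one integer."""
    if not 0 <= node_index <= NODE_MASK:
        raise ValueError("node_index does not fit in NODE_BITS")
    if node_cursor < 0:
        raise ValueError("node_cursor must be non-negative")
    return (node_cursor << NODE_BITS) | node_index


def unpack_cursor(packed: int) -> tuple:
    """Split a packed value back into (node_index, node_cursor)."""
    return packed & NODE_MASK, packed >> NODE_BITS
```

Python integers are unbounded, so nothing overflows here. A real proxy would have to fit the packed value into the cursor width that clients expect, and a node cursor that already uses the high bits would leave no room for the index, which is part of why this is not easy.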

doyoubi avatar Jun 21 '19 03:06 doyoubi