redis-rdb-tools
printing msgpack keys correctly
In Redis, my keys look like this:
a:b:c:c:\xcd\xe9\xba
where \xcd\xe9\xba is a msgpack-encoded portion.
When I run the rdb command like this, I get bad output:
rdb --command justkeys --key "*8767678.*" data/datafeeds-redis/dump.rdb
The output looks like this:
a:{8767678}:b:c:��K
You can see the strange characters ��.
I've tried adding the --escape raw option, but it still outputs the strange characters.
Is there any way around this?
This seems to work: --escape print
Is that the correct way? Thanks!
Yes, that is the correct way if you want to display the keys in a console rather than store them in a file.
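For illustration, escaping a key for a terminal amounts to something like the sketch below. This is not the rdbtools implementation, only an assumption about what a printable escape does: printable ASCII bytes pass through, and every other byte (plus the backslash itself, so the result stays unambiguous) becomes \xNN. The name escape_printable is made up for this example.

```python
def escape_printable(key: bytes) -> str:
    """Render a binary key so that it is safe to show on a terminal."""
    out = []
    for b in key:
        # Printable ASCII, except the backslash, is emitted unchanged.
        if 0x20 <= b < 0x7F and b != 0x5C:
            out.append(chr(b))
        else:
            # Everything else is written as a two-digit hex escape.
            out.append("\\x%02x" % b)
    return "".join(out)


if __name__ == "__main__":
    print(escape_printable(b"a:b:c:c:\xcd\xe9\xba"))
```

Printing the raw bytes instead makes the terminal try to decode them as text, which is where the replacement characters come from.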
I would like to output all of the keys matching the expression to a file.
In that case what is the appropriate action?
Avner
It depends on what you want to do with the file: whether you want to print it, or whether you want it to contain binary (non-printable) data.
Your keys contain binary (non-printable) data, but I see from your first post in this thread that you were expecting escaped output (probably to match what you got from redis-cli), so in that case you probably want --escape print.
I want to do some "expensive" scan on the db file (offline) and then use the keys later. Is it correct to do it as a "printable" output?
It depends on how you'll use the output (the key names). If you want to feed them to redis-cli later, then yes. If you want to write a program that works with them, then you may want the unescaped version. Anyway, this totally depends on what you want to do with the output. Printable output is usually used to display the data on a terminal or copy it to the clipboard.
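If you do store printable output and later need the original bytes in a program, the escapes can be reversed. A minimal sketch, assuming the file holds one key per line and that \xNN is the only escape form used (a key containing a literal backslash followed by xNN would be ambiguous under this assumption); unescape_key is a hypothetical helper name:

```python
import re

# Matches one \xNN escape in a byte string.
_HEX_ESCAPE = re.compile(rb"\\x([0-9a-fA-F]{2})")


def unescape_key(line: str) -> bytes:
    """Turn one line of printable output back into the raw key bytes."""
    # latin-1 maps each character 0-255 to the same byte value.
    raw = line.rstrip("\n").encode("latin-1")
    return _HEX_ESCAPE.sub(lambda m: bytes([int(m.group(1), 16)]), raw)


if __name__ == "__main__":
    print(unescape_key("a:b:c:c:\\xcd\\xe9\\xba\n"))
```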
Thanks!
I've gone ahead and created a small Python util that works on a particular key pattern. I saw that you provide a CLI option for this: --command justkeys --key ".*pattern.*"
I took your example code from the README and did something similar to that flag
(very trivial example):
class MyCallback(RdbCallback):
    ....
    ....
    def set(self, key, value, expiry, info):
        # I only care about the "key" portion
        if simple_pattern not in key:
            return
        # condition() and do_something() are placeholders for my own logic
        if condition(key):
            do_something(key)  # runs against the production db
But since the matching keys are sparse in the file, and my rdb file is 60GB, this is quite slow.
Is there any assumption about the ordering of the keys (are they sorted somehow?) and can I know when all keys of a pattern have been exhausted?
The redis cluster keys are "sharded" using the notation a:b:{shard}:c:d..., so perhaps there is some notion of grouping of the keys.
Is there a way to hint the parser to disregard irrelevant keys or is this something I need to do in the client callback?
Also, is it possible to scan the file from multiple threads to speed it up? (It takes quite a lot of time to scan.)
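For reference, the {shard} notation mentioned above is the Redis Cluster hash tag: only the text between the first { and the next } is hashed when a key is assigned to a slot. The sketch below shows that rule, assuming the standard CRC16 (XMODEM) modulo 16384 scheme from the Redis Cluster specification; hash_tag and key_slot are names chosen for this example. It says nothing about the order of keys inside an rdb file.

```python
import binascii


def hash_tag(key: bytes) -> bytes:
    """Return the part of the key that Redis Cluster hashes."""
    start = key.find(b"{")
    if start != -1:
        end = key.find(b"}", start + 1)
        # An empty tag "{}" does not count, so the whole key is hashed.
        if end != -1 and end != start + 1:
            return key[start + 1:end]
    return key


def key_slot(key: bytes) -> int:
    """Compute the cluster slot (0..16383) for a key."""
    # binascii.crc_hqx with an initial value of 0 is CRC16/XMODEM.
    return binascii.crc_hqx(hash_tag(key), 0) % 16384


if __name__ == "__main__":
    print(key_slot(b"a:b:{8767678}:c:d"))
```

Keys that share a hash tag land in the same slot, so a callback could use key_slot to group keys by shard after the fact.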
Keys in the rdb are not sorted, and redis-rdb-tools processes them in the order they appear in the rdb.
The parser can't completely disregard keys; in most cases it must parse the full key and value in order to skip its bytes. There are cases, for ziplist-encoded values, in which it would be possible to speed up the processing by skipping the value instead of extracting individual elements and calling their callbacks, but this is not currently implemented. In most of those cases, the ziplist-encoded values will be small (not containing many items).
Actually, I take it back. I see that we already efficiently skip whatever we can skip when you use --key.
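From Python, the same kind of filtering can be requested by passing a key regex to the parser instead of testing every key inside the callback. A rough sketch, assuming the filters argument of RdbParser is what --key maps to and that RdbCallback accepts string_escape as in the README example; make_key_filter, scan, and on_key are hypothetical names for this illustration, and only string keys (the set callback) are handled:

```python
import re


def make_key_filter(substring: str) -> str:
    """Build a regex that matches any key containing `substring` literally."""
    return ".*" + re.escape(substring) + ".*"


def scan(rdb_path: str, substring: str, on_key) -> None:
    """Parse rdb_path and call on_key(key) for each matching string key."""
    # Imported here so the helper above works without rdbtools installed.
    from rdbtools import RdbParser, RdbCallback

    class KeyOnlyCallback(RdbCallback):
        def __init__(self):
            super().__init__(string_escape=None)

        def set(self, key, value, expiry, info):
            # Only keys that passed the parser-level filter arrive here.
            on_key(key)

    parser = RdbParser(KeyOnlyCallback(),
                       filters={"keys": make_key_filter(substring)})
    parser.parse(rdb_path)
```

A call such as scan("dump.rdb", "{8767678}", print) would then print each matching key as the parser reaches it.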