python-rocksdb

multi_get and duplicate keys

Open JS-Parent opened this issue 5 years ago • 3 comments

The doc for multi_get mentions that:

keys will not be “de-duplicated”. Duplicate keys will return duplicate values in order. https://python-rocksdb.readthedocs.io/en/latest/api/database.html

But when I use it with duplicated keys I get a dictionary with a single key whose value is not a list with the repeated values:

Example:

import rocksdb

db = rocksdb.DB("/tmp/", rocksdb.Options(create_if_missing=True))
db.put(b'\x00', b'\x00')
d = db.multi_get([b'\x00', b'\x00', b'\x00'])
print(d)
print(type(d[b'\x00']))

outputs:

{b'\x00': b'\x00'}
<class 'bytes'>

JS-Parent, Aug 06 '20 21:08

Hi, the documentation for that function says: Returns: a dict where the value is either bytes or None if not found.

iFA88, Aug 07 '20 06:08

BTW, the NOTICE you quoted is right. You pass your keys in a list, and the function reads every item on that list from the database. The results are then put into a dict, so duplicate keys overwrite each other. If you request a single key 1M times, it is read from the database 1M times and you still get a dict with only one key.
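The behavior described here can be illustrated with a small sketch. It does not use python-rocksdb at all: `store` and `fake_multi_get` are hypothetical stand-ins (a plain dict and a helper function) that only mimic the description above, assuming one lookup per requested key and a dict as the return type.

```python
from typing import Dict, Iterable, Optional

# Hypothetical stand-in for the database: a plain dict, not python-rocksdb.
store: Dict[bytes, bytes] = {b'\x00': b'\x00'}
lookups = 0  # counts how many reads hit the stand-in store


def fake_multi_get(keys: Iterable[bytes]) -> Dict[bytes, Optional[bytes]]:
    """Mimic the described behavior: one lookup per requested key,
    results collected into a dict, so repeated keys overwrite each other."""
    global lookups
    result: Dict[bytes, Optional[bytes]] = {}
    for key in keys:
        lookups += 1                   # every requested key is read, duplicates included
        result[key] = store.get(key)   # a repeated key overwrites the earlier entry
    return result


d = fake_multi_get([b'\x00', b'\x00', b'\x00'])
print(d)        # {b'\x00': b'\x00'}
print(lookups)  # 3
```

Three reads happen, but the dict can only hold one entry per key, which is why the output looks de-duplicated.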

iFA88, Aug 07 '20 06:08

Yes, I understand that the database is queried twice when the list of keys contains the same key twice. However, given that Python dicts cannot store duplicate keys and the documentation states "Duplicate keys will return duplicate values in order", I expected my code snippet to return:

{b'\x00': [b'\x00', b'\x00', b'\x00']}
<class 'list'>

since this seems to be the Pythonic way of storing duplicate entries in a dict. Right now it's confusing: the duplicate values are fetched, but the Python dict drops them, so the result is de-duplicated while the doc says the opposite. If this is the intended behavior, I would just change the doc to make this more explicit.
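Until the doc or the return type changes, one possible workaround is to rebuild the ordered, duplicate-preserving result on the caller's side. The sketch below is only an illustration: `multi_get_ordered` is a hypothetical helper, not part of python-rocksdb, and it assumes it is given a callable that takes a list of keys and returns a dict (as `db.multi_get` does in the example above). A plain dict stands in for the database so the snippet runs on its own.

```python
from typing import Callable, Dict, List, Optional, Sequence, Tuple

# Assumed shape of the lookup function: list of keys in, dict out.
MultiGet = Callable[[List[bytes]], Dict[bytes, Optional[bytes]]]


def multi_get_ordered(
    multi_get: MultiGet, keys: Sequence[bytes]
) -> List[Tuple[bytes, Optional[bytes]]]:
    """Return one (key, value) pair per requested key, in request order."""
    unique = list(dict.fromkeys(keys))   # drop repeats, keep first-seen order
    found = multi_get(unique)            # each distinct key is fetched once
    return [(key, found.get(key)) for key in keys]


# Hypothetical stand-in for the database, used only to make this runnable.
store: Dict[bytes, bytes] = {b'\x00': b'\x00'}


def fake_multi_get(keys: List[bytes]) -> Dict[bytes, Optional[bytes]]:
    return {key: store.get(key) for key in keys}


pairs = multi_get_ordered(fake_multi_get, [b'\x00', b'\x01', b'\x00'])
print(pairs)  # [(b'\x00', b'\x00'), (b'\x01', None), (b'\x00', b'\x00')]
```

With a real database, the first argument would presumably be `db.multi_get`. De-duplicating before the call also avoids reading the same key repeatedly.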

JS-Parent, Aug 07 '20 13:08