RocksDict
How to get WideColumns data in "Raw Mode" from rocksdb
The db is created by a C++ app and contains a lot of WideColumns data. Is it possible to access this data using RocksDict? In the code example below, the size of the value (v) is always 0, but the keys (k) are shown as expected.
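from rocksdict import Rdict, Options

# GLB_PATH: path to the existing database (defined elsewhere)
cf_lst = Rdict.list_cf(GLB_PATH)
opts = Options(raw_mode=True)
db = Rdict(path=GLB_PATH, options=opts, column_families={cf_lst[1]: opts})
db_cf1 = db.get_column_family(cf_lst[1])

# iterate over the second column family and print the size of each value
for k, v in db_cf1.items():
    print(f'k={k}, v size={len(v)}')
db.close()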
OS: Windows 2019 Server; compiler: msvc 19.39.33523; rocksdb: v9.0; RocksDict: v0.3.23
Looks like WideColumns is not yet supported here, and we will need to add APIs like GetEntity and PutEntity to support it.
Quoting rocksdb wide column doc:
The classic Get, MultiGet, GetMergeOperands, and Iterator::value APIs return the value of the default column when they encounter an entity, while the new APIs GetEntity, MultiGetEntity, and Iterator::columns return any plain key-value in the form of an entity with a single column, namely the anonymous default column.
The iterator returns the value of the default column, which is empty. This needs the columns() API, which is not yet supported by rocksdict.
Just curious, what do you use WideColumns for?
For the moment, if the object is not that large, I would suggest using some custom deserialization for the entities. The APIs related to WideColumns have not yet been exposed to the C interface by rocksdb, so I will need some time while waiting for rocksdb to design a proper C interface for the WideColumns-related APIs.
Related: https://github.com/facebook/rocksdb/issues/12635
Some kind of in-memory tables with a random number of columns in each row, which are frequently updated. I have found that WideColumns fits well with my app architecture and allowed me to migrate easily from kx kdb.
For iterating with Python, I'm going to create special copies of several tables using a MessagePack serializer for the entities, in the way you proposed. But that adds some overhead.
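A minimal sketch of that workaround, assuming each entity is flattened into a single plain value so that the classic get/items APIs can read it. The helper names pack_entity and unpack_entity are hypothetical, and the standard-library struct module stands in for MessagePack here only to keep the sketch dependency-free; any serializer that round-trips a list of (name, value) byte pairs would do.

```python
import struct


def pack_entity(columns):
    """Encode a list of (name, value) byte pairs into one bytes value.

    Layout (an assumption of this sketch, not a RocksDB format):
    uint32 column count, then for each column two uint32 lengths
    followed by the raw name bytes and the raw value bytes.
    """
    parts = [struct.pack("<I", len(columns))]
    for name, value in columns:
        parts.append(struct.pack("<II", len(name), len(value)))
        parts.append(name)
        parts.append(value)
    return b"".join(parts)


def unpack_entity(blob):
    """Decode a value produced by pack_entity back into (name, value) pairs."""
    (count,) = struct.unpack_from("<I", blob, 0)
    offset = 4
    columns = []
    for _ in range(count):
        name_len, value_len = struct.unpack_from("<II", blob, offset)
        offset += 8
        name = blob[offset:offset + name_len]
        offset += name_len
        value = blob[offset:offset + value_len]
        offset += value_len
        columns.append((name, value))
    return columns
```

The packed bytes would be written as an ordinary value on the C++ side and decoded with unpack_entity after reading them in raw mode; the cost is the duplicate copy of each table mentioned above.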
I've already drafted an up-stream PR: https://github.com/facebook/rocksdb/pull/12653
Check the wide_columns_raw examples with pip install rocksdict==0.3.24b1 (pypi link).
Tell me if it works 🙂.
No success. With the real db I cannot access wide columns from a column family (CF). Please provide a simple example of how to use the get_entity method with a CF. The db itself works fine; I checked it with the ldb tool.
I'm about to release a beta.2, which will make opening a DB created by other languages (C++, Java, Rust) much more straightforward.
It seems I didn’t clearly explain the problem. In other words, I can’t figure out how to pass CF to the get_entity method.
Ok. Try pip install rocksdict==0.3.24b2, and
from rocksdict import Rdict
# This will automatically load latest options and column families.
# Note also that this is automatically RAW MODE,
# as it knows that the db is not created by RocksDict.
db = Rdict("db_path")
# list column families
cfs = Rdict.list_cf("db_path")
print(cfs)
# use one of the column families
cf1 = db.get_column_family(cfs[1])
# iterate through all wide columns in cf1
for k, v in cf1.entities():
    print(f"{k} -> {v}")
# or query specific entity in cf1
print(cf1.get_entity(b"some_key"))
Tell me if it works.
The logic of rocksdict is that we do not pass a cf argument to any of get, put, iter, get_entity, etc. Instead, use some_cf = db.get_column_family("some_cf_name"), which returns an object with exactly the same methods as Rdict, including get, put, delete, get_entity, etc. All of these operations return only data from some_cf.