
merge operators

Open Congyuwang opened this issue 1 year ago • 2 comments

Been using this library for a year and loving it. Is there any plan on integrating merge operators into this library? I have a few use cases that would really benefit from them.

Originally posted by @hootie-hoo in https://github.com/rocksdict/RocksDict/discussions/159

Congyuwang Nov 23 '24 01:11

I did some research, and it seems to me that merge operators are rather complicated. There is a distinction between full merge (full_merge) and partial merge. How would you design the Python interface?
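For readers unfamiliar with the distinction mentioned above, here is a minimal pure-Python sketch of the two roles, using a comma-separated append as the example operator. It does not use rocksdict or RocksDB at all; the function names `full_merge` and `partial_merge` and their signatures are assumptions chosen for illustration only, not an existing or proposed API.

```python
from typing import List, Optional

SEP = b","  # separator for this illustrative "string append" operator


def full_merge(key: bytes, existing: Optional[bytes], operands: List[bytes]) -> bytes:
    """Fold the stored base value (if any) together with all pending operands, oldest first."""
    parts = ([existing] if existing is not None else []) + list(operands)
    return SEP.join(parts)


def partial_merge(key: bytes, left: bytes, right: bytes) -> bytes:
    """Combine two adjacent operands into a single operand, without seeing the base value."""
    return left + SEP + right


base = b"a"
ops = [b"b", b"c", b"d"]

# Path 1: apply every operand directly to the base value.
direct = full_merge(b"k", base, ops)

# Path 2: collapse the operands first (as could happen before the base value is known),
# then apply the single collapsed operand to the base value.
collapsed = partial_merge(b"k", partial_merge(b"k", ops[0], ops[1]), ops[2])
via_partial = full_merge(b"k", base, [collapsed])

# For a well-behaved operator both paths must give the same result.
assert direct == via_partial
```

The interface question is then whether a Python user would have to supply both functions, or only one associative function that serves both roles.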

Congyuwang Dec 16 '24 22:12

Hi @Congyuwang 👋

Thank you for all your work on this project! I couldn't be more excited to start building with rocksdict.

I have a few thoughts on how we might approach merge support:

  • it might be good to start with supporting merge operators in raw mode first
    • defining merge operators in Python adds a lot of project complexity
    • the runtime overhead of executing Python during compaction would likely be significant (unless we can compile down to pure C++ using Cython & still keep Python interoperability? Maybe there's an approach here that I'm missing?)
    • We might be able to support Python classes sourced from Rust using PyO3, where we can then define the merge operators in Rust. This adds a good bit of complexity for the user though...
    • I think supporting merge in raw mode would be more straightforward than supporting merge in default mode, because the user could define the merge operators in Rust without worrying about any Python<->Rust bindings (the operator in Rust is only implicitly coupled with their bytes usage in Python)

Here's what I'm thinking might be a good order of progression for supporting merge in raw mode:

  1. ~~support existing [built-in merge operators](https://github.com/facebook/rocksdb/tree/main/utilities/merge_operators): `opt.set_merge_operator('string_append', ',')`~~
  2. support defining custom merge operators in a user-owned Rust lib, and dynamically load the operator functions, which can then be referenced by name from Python. This might look like:
    • user defines a custom operator function that implements an operator trait defined in the rust-rocksdb fork
    • user loads the lib containing their custom merge functions using a rocksdict option/function (dynamic loading?)
    • user specifies one of the custom merge functions when creating a db -> `opt.set_merge_operator('concat_merge')`
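To make the "referenced by name" idea concrete, here is a rough pure-Python mock of the lookup flow. Everything in it is hypothetical: `register_merge_operator`, `ToyRawStore`, and the operator signature are stand-ins invented for this sketch, the registry stands in for a dynamically loaded Rust lib, and none of it is rocksdict's actual API. It only shows how a raw-mode (bytes in, bytes out) operator could be selected by name.

```python
from typing import Callable, Dict, List, Optional

# Hypothetical operator signature: (key, existing_value, operands) -> merged value, all raw bytes.
MergeFn = Callable[[bytes, Optional[bytes], List[bytes]], bytes]

# Stand-in for the set of operators exposed by a loaded user library.
_REGISTRY: Dict[str, MergeFn] = {}


def register_merge_operator(name: str, fn: MergeFn) -> None:
    """Hypothetical: make an operator available under a name."""
    _REGISTRY[name] = fn


def concat_merge(key: bytes, existing: Optional[bytes], operands: List[bytes]) -> bytes:
    """Example operator: append every operand to the existing bytes."""
    return (existing or b"") + b"".join(operands)


class ToyRawStore:
    """In-memory mock of a raw-mode store that resolves its merge operator by name."""

    def __init__(self, merge_operator: str) -> None:
        self._merge = _REGISTRY[merge_operator]  # KeyError if the name was never registered
        self._base: Dict[bytes, bytes] = {}
        self._pending: Dict[bytes, List[bytes]] = {}

    def put(self, key: bytes, value: bytes) -> None:
        self._base[key] = value
        self._pending.pop(key, None)  # a put supersedes earlier merge operands

    def merge(self, key: bytes, operand: bytes) -> None:
        self._pending.setdefault(key, []).append(operand)  # deferred, applied on read

    def get(self, key: bytes) -> Optional[bytes]:
        operands = self._pending.get(key, [])
        if not operands:
            return self._base.get(key)
        return self._merge(key, self._base.get(key), operands)


register_merge_operator("concat_merge", concat_merge)
store = ToyRawStore(merge_operator="concat_merge")
store.put(b"k", b"ab")
store.merge(b"k", b"cd")
store.merge(b"k", b"ef")
result = store.get(b"k")
```

In the real design the operator body would live in Rust and run without touching the Python interpreter; only the name would cross the language boundary.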

Let me know what your thoughts are! Happy to help contribute here.

Edit: struck through the inapplicable use of default operators for raw mode

0trust3r Mar 16 '25 04:03