rocksdb icon indicating copy to clipboard operation
rocksdb copied to clipboard

Why `partial_merge` not called when `get` is called and there are multiple merge operations?

Open curtainp opened this issue 2 years ago • 8 comments

Hi,

I've encountered a scenario where a get operation is performed after a large number of merge operations, and observed that the number of operators in the merge_operator triggered by the get operation has been incrementing, and only the full_merge callback set at initialization was called.

But according to the wiki description, if there are multiple merge operations when the get operation is executed, it should first call partial_merge to try to merge the operands, so why does this logic only trigger when the compaction is performed (tested)?

curtainp avatar Dec 08 '22 01:12 curtainp

But according to the wiki description, if there are multiple merge operations when the get operation is executed, it should first call partial_merge to try to merge the operands

Can you point to the wiki which says Get() will trigger this?

Which merge operator is being used? Does it implement PartialMerge() or PartialMergeMulti()?

riversand963 avatar Dec 12 '22 18:12 riversand963

But according to the wiki description, if there are multiple merge operations when the get operation is executed, it should first call partial_merge to try to merge the operands

Can you point to the wiki which says Get() will trigger this?

Which merge operator is being used? Does it implement PartialMerge() or PartialMergeMulti()?

I found it from the official wiki that https://github.com/facebook/rocksdb/wiki/Merge-Operator-Implementation

says that when Get() called, Pseudocode Logic like this:

Get(key):
  Let stack = [ ];       // in reality, this should be a "deque", but stack is simpler to conceptualize for this pseudocode
  for each entry OPi from newest to oldest:
    if OPi.type is "merge_operand":
      push OPi to stack
        while (stack has at least 2 elements and (stack.top() and stack.second_from_top() can be partial-merged)
          OP_left = stack.pop()
          OP_right = stack.pop()
          result_OP = client_merge_operator.PartialMerge(OP_left, OP_right)
          push result_OP to stack
    else if OPi.type is "put":
      return client_merge_operator.FullMerge(v, stack);
    else if v.type is "delete":
      return client_merge_operator.FullMerge(nullptr, stack);

  // We've reached the end (OP0) and we have no Put/Delete, just interpret it as empty (like Delete would)
  return client_merge_operator.FullMerge(nullptr, stack);

and i use MergeOperator also implement PartialMerge() myself.

But in the end only the FullMerge() called, unless i manually trigger the compaction.

curtainp avatar Dec 13 '22 01:12 curtainp

How about PartialMergeMulti()?

riversand963 avatar Dec 13 '22 20:12 riversand963

I think that wiki is outdated. That pseudocode is nine years old. I believe we intend to call FullMergeV2() when the operand list is known to include the oldest possible operand/base value, as is the case during Get().

ajkr avatar Dec 13 '22 20:12 ajkr

How about PartialMergeMulti()?

not yet, i will try later.

curtainp avatar Dec 14 '22 01:12 curtainp

I think that wiki is outdated. That pseudocode is nine years old. I believe we intend to call FullMergeV2() when the operand list is known to include the oldest possible operand/base value, as is the case during Get().

I think so too, and now it is true that FullMergeV2() is called, but does ParitialMerge() only work at the time of compaction?

curtainp avatar Dec 14 '22 01:12 curtainp

I think that wiki is outdated. That pseudocode is nine years old. I believe we intend to call FullMergeV2() when the operand list is known to include the oldest possible operand/base value, as is the case during Get().

I have encountered a similar phenomenon and the performance of reading is relatively poor cause of only FullMerge being called.

so I want to figure out if a Get operation would not trigger a PartialMerge.

Learndeligent avatar Jan 09 '24 08:01 Learndeligent

Your finding is expected and correct. Only full merge should be called during Get(). The merge needs to be full to return a result; adding partial merge during Get() is unlikely to help

ajkr avatar Jan 09 '24 17:01 ajkr