Better strategy for forcing a flush when the memtable size limit is reached

Open ShiKaiWi opened this issue 2 years ago • 0 comments

Describe This Problem

In the current implementation, when the space's memory usage limit is reached, the table whose memtable consumes the most memory is forced to flush.

The problems include:

  • Finding the table with the largest memtable is not trivial when the number of tables is large.
  • Once that table is found, only it is forced to flush, so the space's memory usage may be reduced only slightly.
  • Every write request checks whether the space memory usage limit has been reached, which costs too much CPU when a massive number of tables exist.

Proposal

Here is a simple proposal to address the problems mentioned above:

  • There is no need to find the single table with the largest memtable. Instead, we can flush a batch of tables whose memtables are large enough (exceeding half of the table's memtable size limit).
  • There is no need to check the space memory usage on every write request. The check can be skipped until the following requirements are met:
    • It has been a while since the last check;
    • It has been a while since the last space-level flush, if one has occurred.
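
To make the proposal concrete, here is a minimal Rust sketch of both ideas. It is not HoraeDB's actual code: `TableMemUsage`, `pick_flush_candidates`, `SpaceFlushThrottle`, and the two intervals are hypothetical names and parameters chosen for illustration, and "a while" is modelled as two configurable durations.

```rust
use std::time::{Duration, Instant};

/// Hypothetical per-table snapshot; field names are illustrative only.
#[derive(Debug, Clone, PartialEq)]
pub struct TableMemUsage {
    pub table_id: u64,
    pub memtable_bytes: usize,
    pub memtable_limit_bytes: usize,
}

/// Selects every table whose memtable exceeds half of that table's own
/// memtable size limit, instead of searching for the single largest one.
pub fn pick_flush_candidates(tables: &[TableMemUsage]) -> Vec<u64> {
    tables
        .iter()
        .filter(|t| t.memtable_bytes > t.memtable_limit_bytes / 2)
        .map(|t| t.table_id)
        .collect()
}

/// Hypothetical gate that decides whether a write request should run the
/// space-level memory check at all.
pub struct SpaceFlushThrottle {
    check_interval: Duration, // assumed minimum gap between two checks
    flush_cooldown: Duration, // assumed minimum gap after a space-level flush
    last_check: Option<Instant>,
    last_flush: Option<Instant>,
}

impl SpaceFlushThrottle {
    pub fn new(check_interval: Duration, flush_cooldown: Duration) -> Self {
        Self {
            check_interval,
            flush_cooldown,
            last_check: None,
            last_flush: None,
        }
    }

    /// Returns true only if enough time has passed since the last check and
    /// since the last space-level flush (if any). Records the check time
    /// when it returns true.
    pub fn should_check(&mut self, now: Instant) -> bool {
        let check_due = self
            .last_check
            .map_or(true, |t| now.saturating_duration_since(t) >= self.check_interval);
        let flush_cooled = self
            .last_flush
            .map_or(true, |t| now.saturating_duration_since(t) >= self.flush_cooldown);
        if check_due && flush_cooled {
            self.last_check = Some(now);
            true
        } else {
            false
        }
    }

    /// Call this after a space-level flush has been triggered.
    pub fn record_flush(&mut self, now: Instant) {
        self.last_flush = Some(now);
    }
}
```

On the write path, a caller would first ask `should_check`, compare the space's memory usage against its limit only when that returns true, and then flush the tables returned by `pick_flush_candidates`, calling `record_flush` afterwards. Passing `now` in explicitly keeps the gate easy to test without sleeping.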

Additional Context

No response

ShiKaiWi · Oct 26 '23 06:10