horaedb
Better strategy for forcing a flush when the memtable size limit is reached
Describe This Problem
In the current implementation, when the space's memory usage limit is reached, the table whose memtable consumes the most memory is forced to flush.
This causes several problems:
- Finding the table with the largest memtable is not trivial when the number of tables is large.
- Once that table is found, it is the only one that is flushed, so the space's memory usage may be reduced only slightly.
- Every write request checks whether the space's memory usage limit has been reached, which costs too much CPU when there are a massive number of tables.
Proposal
Here is a simple proposal to address the problems mentioned above:
- There is no need to find the single table with the largest memtable. Instead, we can flush a batch of tables whose memtables are large enough (exceeding half of the table's memtable size limit).
- There is no need to check the space's memory usage on every write request. The check runs only when both of the following requirements are met:
  - It has been a while since the last check;
  - It has been a while since the last space-level flush, if one has occurred.
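
The two points above could be sketched roughly as follows. This is only an illustration of the idea, not HoraeDB's actual code: `TableMemUsage`, `pick_flush_candidates`, `SpaceFlushThrottle`, and the two interval parameters are hypothetical names, and "a while" is assumed to be a configurable `Duration`.

```rust
use std::time::{Duration, Instant};

/// Hypothetical per-table memory view (not HoraeDB's real types).
#[derive(Debug, Clone, PartialEq)]
pub struct TableMemUsage {
    pub name: String,
    pub memtable_bytes: usize,
    pub memtable_limit_bytes: usize,
}

/// Select every table whose memtable exceeds half of its own size limit,
/// instead of searching for the single largest memtable.
pub fn pick_flush_candidates(tables: &[TableMemUsage]) -> Vec<&TableMemUsage> {
    tables
        .iter()
        .filter(|t| t.memtable_bytes > t.memtable_limit_bytes / 2)
        .collect()
}

/// Hypothetical throttle deciding whether the space-level memory check
/// should run for the current write request.
pub struct SpaceFlushThrottle {
    check_interval: Duration, // assumed minimum gap between two checks
    flush_interval: Duration, // assumed minimum gap after a space-level flush
    last_check: Option<Instant>,
    last_flush: Option<Instant>,
}

impl SpaceFlushThrottle {
    pub fn new(check_interval: Duration, flush_interval: Duration) -> Self {
        Self {
            check_interval,
            flush_interval,
            last_check: None,
            last_flush: None,
        }
    }

    /// Returns true (and records the check) only if enough time has passed
    /// since both the last check and the last space-level flush, if any.
    pub fn should_check(&mut self, now: Instant) -> bool {
        let long_enough = |since: Option<Instant>, min: Duration| {
            since.map_or(true, |t| now.saturating_duration_since(t) >= min)
        };
        if long_enough(self.last_check, self.check_interval)
            && long_enough(self.last_flush, self.flush_interval)
        {
            self.last_check = Some(now);
            true
        } else {
            false
        }
    }

    /// Record that a space-level flush was triggered at `now`.
    pub fn record_flush(&mut self, now: Instant) {
        self.last_flush = Some(now);
    }
}
```

On the write path, a caller would first ask `should_check(Instant::now())`; only if it returns true and the space limit is actually exceeded would it flush the tables returned by `pick_flush_candidates` and call `record_flush`. Most writes would then skip the space-level check entirely.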
Additional Context
No response