horaedb
General wal deletion model
Describe This Problem
Ultimately, I want to refactor the whole wal module, for these reasons:
- It is hard to implement a master-follower model on top of the current wal implementation.
- It is hard to introduce a new component (such as HBase, HDFS...) as the wal storage backend.
- The architecture is messy. The logic can be divided into the wal and the wal's storage, so that all logic in the abstract wal can be reused.
However, this is a massive amount of work, and I cannot finish it in the short term.
As a start, I want to refactor only the deletion part in the Kafka based WAL (the hardest part of the wal, and of the general wal model I have in mind).
Proposal
The steps in this deletion model can be divided into two parts: mark deleted and clean.
In the mark deleted part:
- We first need to create a new wal file (we will simulate this in Kafka).
- If something goes wrong, we just ignore the error and keep using the old file.
- If everything is OK, we switch to the new file.
- We record the flushed file on the table that was flushed.
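The mark deleted steps can be sketched roughly as follows. This is only an illustration under stated assumptions, not the actual horaedb code: `WalStorage`, `MemStorage`, `Region`, and `mark_deleted` are hypothetical names, files are identified by increasing `u64` numbers, and "the flushed file" is assumed to be the file that was active when the flush happened.

```rust
use std::collections::HashMap;

/// Hypothetical storage backend. In the Kafka based WAL, "creating a file"
/// would be simulated rather than done on a real filesystem.
pub trait WalStorage {
    fn create_file(&mut self, file_num: u64) -> Result<(), String>;
}

/// In-memory stand-in used only for this illustration.
pub struct MemStorage {
    pub fail: bool,
    pub created: Vec<u64>,
}

impl WalStorage for MemStorage {
    fn create_file(&mut self, file_num: u64) -> Result<(), String> {
        if self.fail {
            return Err(format!("cannot create wal file {}", file_num));
        }
        self.created.push(file_num);
        Ok(())
    }
}

/// Hypothetical per-region state: the active file and, for each table id,
/// the file number recorded at its last flush.
pub struct Region<S: WalStorage> {
    pub storage: S,
    pub current_file: u64,
    pub flushed_files: HashMap<u64, u64>,
}

impl<S: WalStorage> Region<S> {
    pub fn new(storage: S, current_file: u64) -> Self {
        Region { storage, current_file, flushed_files: HashMap::new() }
    }

    /// Runs the mark deleted steps for one flushed table and returns the
    /// file number that is active afterwards.
    pub fn mark_deleted(&mut self, table_id: u64) -> u64 {
        // Assumption: the flushed file is the one active at flush time.
        let flushed_file = self.current_file;
        let next_file = flushed_file + 1;
        // Try to create the new file; on error, ignore it and keep the old file.
        if self.storage.create_file(next_file).is_ok() {
            self.current_file = next_file;
        }
        // Record the flushed file on the flushed table.
        self.flushed_files.insert(table_id, flushed_file);
        self.current_file
    }
}
```

In this sketch a failed file creation is not fatal: the region keeps writing to the old file, and the table still records that file as its flushed file.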
In the clean part:
- We scan the metadata of all tables in the region, compare their flushed files, and find the smallest file number.
- We first make a snapshot of the table metadata.
- Then we remove all files whose file number is less than that smallest file number.
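A minimal sketch of the clean part, again under assumptions rather than as the real implementation: `clean` and `persist_snapshot` are hypothetical names, table metadata is reduced to a map from table id to flushed file number, and it is assumed that nothing is removed when the snapshot cannot be persisted or when the region has no tables.

```rust
use std::collections::{BTreeSet, HashMap};

/// Removes every wal file whose number is below the smallest flushed file
/// number across all tables, and returns the removed file numbers.
///
/// `flushed_files` maps table id -> flushed file number.
/// `persist_snapshot` is a hypothetical hook that stores the metadata snapshot.
pub fn clean<F>(
    files: &mut BTreeSet<u64>,
    flushed_files: &HashMap<u64, u64>,
    persist_snapshot: F,
) -> Result<Vec<u64>, String>
where
    F: FnOnce(&HashMap<u64, u64>) -> Result<(), String>,
{
    // Scan all table metadata and find the smallest flushed file number.
    let min_flushed = match flushed_files.values().min() {
        Some(min) => *min,
        // Assumption: with no tables there is nothing safe to remove.
        None => return Ok(Vec::new()),
    };

    // Snapshot the table metadata first; bail out before deleting anything
    // if the snapshot cannot be persisted (an assumption of this sketch).
    let snapshot = flushed_files.clone();
    persist_snapshot(&snapshot)?;

    // `split_off` returns the files >= min_flushed and leaves the smaller
    // ones in `files`, so swap them to keep only the surviving files.
    let kept = files.split_off(&min_flushed);
    let removed: Vec<u64> = std::mem::replace(files, kept).into_iter().collect();
    Ok(removed)
}
```

Taking the snapshot before the removal means a failure between the two steps leaves extra files behind rather than losing files that some table may still need.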
Additional Context
The development plan:
- [ ] Implement the page manager first.
- [ ] Make use of the page manager.
No response
Hello, I want to challenge this seemingly difficult job. Is there anything I can do right now?
We haven't finished the plan because it is low priority. In fact, the most annoying problem is how to keep it compatible with the old wal module... Because ceresdb has released version 1.0, breaking changes are unacceptable in the short term...
We are currently working on supporting influxql; if you are interested, you are welcome to join.
Of course, I'm interested in this and will try to do some work after I implement #558.