
Introduce a new disk-based WAL implementation for standalone deployment

Open jiacai2050 opened this issue 2 years ago • 2 comments

Describe This Problem

After #1272, the different WAL implementations sit behind feature gates. This is important for reducing compile time, since the RocksDB-based WAL is very slow to compile.

However, we still have to enable the RocksDB WAL by default because we have no other reliable WAL implementation for standalone deployment: the message queue and table-kv implementations introduce more complexity. A better choice is to implement another WAL directly on top of the local disk.

Proposal

The main trait to implement is WalManager

  • https://github.com/CeresDB/ceresdb/blob/affa2c65a279252d10475c6a6d201cd7f60b5689/src/wal/src/manager.rs#L320

A simple structure I think of is like this:

$ tree wal
wal
└──region-1
   ├──f1
   ├──f2
   └──....

WAL entries of different tables are saved together in fixed-size files (such as 64M), which has the advantage of fast writes.
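To make the fixed-size file idea concrete, here is a minimal sketch of an append-only writer that rolls over to a new file when the size limit would be exceeded. Everything in it is an assumption for illustration: the `SegmentWriter` name, the record framing (table id, sequence, length, payload), and the rollover rule are hypothetical and are not the `WalManager` API or the final on-disk format. It also ignores recovery of an existing directory.

```rust
use std::fs::{self, File, OpenOptions};
use std::io::{self, Write};
use std::path::{Path, PathBuf};

/// Hypothetical writer: records of all tables go to the same segment file,
/// and a new file (f1, f2, ...) is started once the size limit is reached.
pub struct SegmentWriter {
    dir: PathBuf,
    max_segment_size: u64,
    segment_id: u64,
    written: u64,
    file: File,
}

impl SegmentWriter {
    /// Creates the region directory if needed and starts at segment `f1`.
    /// Assumes the directory is empty (no recovery of older segments).
    pub fn open(dir: &Path, max_segment_size: u64) -> io::Result<Self> {
        fs::create_dir_all(dir)?;
        let file = Self::create_segment(dir, 1)?;
        Ok(Self {
            dir: dir.to_path_buf(),
            max_segment_size,
            segment_id: 1,
            written: 0,
            file,
        })
    }

    fn create_segment(dir: &Path, id: u64) -> io::Result<File> {
        OpenOptions::new()
            .create(true)
            .append(true)
            .open(dir.join(format!("f{}", id)))
    }

    /// Appends one record framed as (assumed layout, little endian):
    /// table_id: u64 | sequence: u64 | payload_len: u32 | payload.
    /// Returns the id of the segment that received the record.
    pub fn append(&mut self, table_id: u64, sequence: u64, payload: &[u8]) -> io::Result<u64> {
        let record_len = 8 + 8 + 4 + payload.len() as u64;
        // Roll over only if the current segment already holds data, so an
        // oversized record still lands in a segment of its own.
        if self.written > 0 && self.written + record_len > self.max_segment_size {
            self.file.sync_all()?;
            self.segment_id += 1;
            self.file = Self::create_segment(&self.dir, self.segment_id)?;
            self.written = 0;
        }
        let mut buf = Vec::with_capacity(record_len as usize);
        buf.extend_from_slice(&table_id.to_le_bytes());
        buf.extend_from_slice(&sequence.to_le_bytes());
        buf.extend_from_slice(&(payload.len() as u32).to_le_bytes());
        buf.extend_from_slice(payload);
        self.file.write_all(&buf)?;
        self.written += record_len;
        Ok(self.segment_id)
    }
}
```

Because every write is a sequential append to a single open file, the write path needs no per-table lookup. The price is paid on deletion, where records of different tables share the same file.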

As for deletion, since entries of different tables are interleaved in the same file, we need to loop over all tables to find the minimal sequence location up to which data can be deleted.
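One possible way to compute that location, as a rough sketch: keep, for each file, the highest sequence written per table, and compare it with the sequence each table has already flushed. The types and the `first_segment_to_keep` function below are hypothetical names, and the sketch assumes sequences start at 1, so a table with no flushed sequence is treated as 0.

```rust
use std::collections::HashMap;

pub type TableId = u64;
pub type SequenceNumber = u64;
pub type SegmentId = u64;

/// Hypothetical per-file index: the highest sequence written for each
/// table that has at least one record in this file.
pub struct SegmentMeta {
    pub id: SegmentId,
    pub max_seq_per_table: HashMap<TableId, SequenceNumber>,
}

/// Returns the id of the earliest file that still holds unflushed data.
/// Files with a smaller id can be removed. `None` means no file holds
/// unflushed data (a real implementation would still keep the active file).
pub fn first_segment_to_keep(
    segments: &[SegmentMeta],
    flushed: &HashMap<TableId, SequenceNumber>,
) -> Option<SegmentId> {
    // For every table, remember the earliest file that it still needs.
    let mut earliest_needed: HashMap<TableId, SegmentId> = HashMap::new();
    for seg in segments {
        for (&table, &max_seq) in &seg.max_seq_per_table {
            // Assumption: sequences start at 1, so 0 means "nothing flushed".
            let flushed_seq = flushed.get(&table).copied().unwrap_or(0);
            if max_seq > flushed_seq {
                earliest_needed
                    .entry(table)
                    .and_modify(|id| *id = (*id).min(seg.id))
                    .or_insert(seg.id);
            }
        }
    }
    // The minimum over all tables is the deletion boundary.
    earliest_needed.values().copied().min()
}
```

A single table that flushes rarely pins every file written after its oldest unflushed record, so the amount of reclaimable space depends on the slowest table in the region.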

The layout of each file could follow the Prometheus WAL design: https://github.com/prometheus/prometheus/blob/main/tsdb/docs/format/wal.md

Additional Context

We will finish this task with help of @dracoooooo at OSPP, TODOs are listed below:

  • [ ] Implement a disk-based WAL that can pass the existing unit tests.
    • [x] write
    • [x] read
    • [x] scan
    • [ ] delete
    • [ ] multiple segments
  • [x] Remove unwrap and handle errors.
  • [x] Add unit tests for the new code.
  • [ ] Test on large-scale data.
  • [ ] Compare with the existing RocksDB WAL implementation and optimize performance.

jiacai2050 avatar Oct 24 '23 12:10 jiacai2050

@jiacai2050 Should replication of the WAL be taken into consideration?

ShiKaiWi avatar Oct 25 '23 08:10 ShiKaiWi

Do you mean distributed WAL storage?

If so, I don't think we need a component like this; Kafka works well for this case.

jiacai2050 avatar Oct 26 '23 07:10 jiacai2050