wickdb icon indicating copy to clipboard operation
wickdb copied to clipboard

Feature: support concurrently insert batches into memtable

Open Fullstop000 opened this issue 5 years ago • 9 comments
trafficstars

In the origin version of leveldb, the write path is like below:

          write
  write     +         write
    +       |   write   +
    |       |     +     |
    |       |     |     |
    |       |     |     |
+---v-------v-----v-----v---+
|                           |
|        Mutex<queue>       |
|                           |
+-------------+-------------+
              |
              |
              |  group batch
              | <single thread>
              |
              v
     +--------+--------+
     |                 |
     |     Memtable    |
     |                 |
     +-----------------+

Writing batches concurrently will result in a single thread ingestion into memtable and then notify all the relate blocked write request threads. Maybe we can support concurrently inserting batches into memtable , which should improve the write performance a lot.

This feature includes following tasks:

  • [x] Introduce an thread-safe concurrent skiplit (https://github.com/Fullstop000/wickdb/pull/85)
  • [ ] Make the current queue scale with inserting threads

Fullstop000 avatar Sep 10 '20 14:09 Fullstop000

The current Skiplist approach doesn't support concurrent inserting. Maybe we should port InlineSkipList from rocksdb https://github.com/facebook/rocksdb/blob/master/memtable/inlineskiplist.h

Fullstop000 avatar Sep 14 '20 08:09 Fullstop000

The current Skiplist approach doesn't support concurrent inserting. Maybe we should port InlineSkipList from rocksdb https://github.com/facebook/rocksdb/blob/master/memtable/inlineskiplist.h

I would like to try on this.

jayzhan211 avatar Sep 18 '20 06:09 jayzhan211

@accelsao Also you may refer to https://github.com/dgraph-io/badger/blob/master/skl/skl.go, which looks like a more clean version of the rocksdb's

Fullstop000 avatar Sep 21 '20 10:09 Fullstop000

badger arena is different from rocksbdb, do you know which one is better to write in rust? badger save node offset instead of a pointer.

jayzhan211 avatar Sep 23 '20 06:09 jayzhan211

badger arena is different from rocksbdb, do you know which one is better to write in rust? badger save node offset instead of a pointer.

I think they serve the same purpose with a different approach. Using the current arena should be enough.

Fullstop000 avatar Sep 24 '20 02:09 Fullstop000

How do you think about crossbeam/skiplist, is it a good way to achieve concurrent insert for memtable?

jayzhan211 avatar Sep 29 '20 12:09 jayzhan211

How do you think about crossbeam/skiplist, is it a good way to achieve concurrent insert for memtable?

It may need some extra work to fit the Comparator

Fullstop000 avatar Oct 10 '20 02:10 Fullstop000

@accelsao I've tested the offset-based Arena and it seems more suitable with your inline skiplist implementation. Do you have time to bring it to your current PR? Maybe I can help :)

Fullstop000 avatar Oct 14 '20 10:10 Fullstop000

Can you help me? I am quite busy now QQ

jayzhan211 avatar Oct 15 '20 06:10 jayzhan211