bolt icon indicating copy to clipboard operation
bolt copied to clipboard

ponies: Lower write amplification by buffering writes in parents

Open tv42 opened this issue 10 years ago • 1 comments

This is interesting, and I'd kinda love to be able to say I want to write this code, but I can't make that promise. And it's probably a bunch of work. I still wanted to write this down here, if for nothing else than to inspire others.

There's several variations of the underlying ideas, e.g. Fractal tree indexing / Bε-trees / buffered trees etc. They each have own unique little twists, like how exactly the internals of the buffer are managed (just sorted sequence, or index structure, etc) but the gist of them all is this:

Keep a buffer in every inner node, partitioned just like the child pointers. If you're processing a Put, and you're about to descend into child C, instead see if there's room in buffer[C]; if yes, instead store the Put there. Only if buffer[C] fills up, flush it down to the child.

(Of course, a CoW B-tree changes the story a little, as you don't need to preallocate the buffer, but instead add to it while copying the node.)

The above means that there can be several writes before more than one node needs to be dirtied, and this in turn decreases the write amplification.

http://www.tokutek.com/wp-content/uploads/2012/11/Fractal-Tree-Technology-and-the-Art-of-Indexing.pdf http://www.tokutek.com/2012/12/fractal-tree-indexing-overview/ http://www.cs.au.dk/~gerth/papers/alcomft-tr-03-75.pdf http://www.cs.au.dk/~gerth/slides/soda03.pdf

tv42 avatar May 01 '15 01:05 tv42

@tv42 I like the idea of implementing more complex indexing but I think that would become a fork of Bolt. Bolt is at a point of stability where I'd rather keep it as-is and make a new project that optimizes for special cases.

benbjohnson avatar May 01 '15 20:05 benbjohnson