Generational compaction
(This is a draft we're opening for discussion. The bulk of required information on design background, analysis and implementation is in the commits, including some design docs added to the repo. We will flesh this PR out as the feature gets closer to being ready.)
Overview
This PR implements a "generational" storage model in couch_bt_engine, which @janl and I have been working on. Its aim is to improve the performance of compaction on large databases with seldom-changing documents, where every compaction run currently has to copy a mostly-unchanged set of data into the new file.
The generational model splits a shard's data storage into multiple generations, where the usual db.couch file is "generation 0". On compaction, live data in this file is promoted into generation 1. The next time generation 0 is compacted, it does not have to copy the same set of data again, as much of it will already have been moved to another file.
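To illustrate the intended effect, here is a toy model in Python. It is only a sketch and not how couch_bt_engine works: `ToyShard`, its methods, and the in-memory dicts standing in for files are all hypothetical, and it assumes an update always lands in generation 0 and immediately drops the stale copy from older generations.

```python
class ToyShard:
    """Hypothetical in-memory stand-in for a shard; gens[0] plays the role of db.couch."""

    def __init__(self, generations=2):
        # One dict per generation, mapping doc id -> body.
        self.gens = [{} for _ in range(generations)]

    def write(self, doc_id, body):
        # Toy simplification: drop any stale copy right away. A real file
        # would only reclaim that space on a later compaction.
        for gen in self.gens[1:]:
            gen.pop(doc_id, None)
        # New and updated documents always land in generation 0.
        self.gens[0][doc_id] = body

    def read(self, doc_id):
        # Each live document exists in exactly one generation in this model.
        for gen in self.gens:
            if doc_id in gen:
                return gen[doc_id]
        return None

    def compact(self, generation=0):
        """Promote live data to the next generation; return the number of docs copied."""
        if generation + 1 >= len(self.gens):
            raise ValueError("compacting the last generation is not modelled here")
        live = self.gens[generation]
        copied = len(live)
        self.gens[generation + 1].update(live)
        self.gens[generation] = {}
        return copied


shard = ToyShard()
for i in range(1000):
    shard.write(f"doc-{i}", {"rev": 1})
first_run = shard.compact(0)   # every document is promoted to generation 1

for i in range(10):
    shard.write(f"doc-{i}", {"rev": 2})
second_run = shard.compact(0)  # only the 10 changed documents are copied

print(first_run, second_run)
```

In this model the second run copies only the documents that changed since the first run, whereas a single-file compaction would copy all 1000 live documents each time.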
Further detail on the design and analysis is in design docs we have committed to the repo; see https://github.com/neighbourhoodie/couchdb/blob/feat/generational-compaction/src/couch/doc/generational-compaction. The commit messages give further details about the implementation.
Open questions
- What tests need to be added to adequately cover this functionality and make sure there is no risk of data loss?
- Since compaction is now parameterised by generation, how do we ensure that only one compaction runs per shard at a time? For example, generations 1 and 2 of the same shard must not compact concurrently, since that would break our consistency assumptions.
Testing recommendations
Related Issues or Pull Requests
Checklist
- [ ] Code is written and works correctly
- [ ] Changes are covered by tests
- [ ] Any new configurable parameters are documented in rel/overlay/etc/default.ini
- [ ] Documentation changes were made in the src/docs folder
- [ ] Documentation changes were backported (separated PR) to affected branches