Remove indexing in logs

Open kuujo opened this issue 8 years ago • 0 comments

The current implementation of Copycat's log uses in-memory indexes to support random access to entries in the log. However, this feature is a holdover from the log's evolution; it is no longer necessary and now only adds overhead.

Because servers tend to read logs sequentially, Copycat's log should instead expose an iterator-like interface that allows for such sequential reading. When indexes are reset - e.g. when a follower returns a lower index to a leader - iterators should be reset by scanning segments. This is a rare occurrence that typically only takes place shortly after a leader election, so the overhead of scanning a segment is acceptable.
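To make the proposal concrete, here is a minimal sketch of what such a reader could look like. It is not Copycat's actual API: `SketchSegment`, `SketchLogReader`, and their methods are hypothetical names, and segments are modeled as in-memory lists rather than files, so the "scan" in `reset` is only simulated by walking forward from the start of the segment.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.NoSuchElementException;

// Hypothetical segment: a contiguous run of entries starting at firstIndex.
final class SketchSegment {
    final long firstIndex;
    final List<String> entries = new ArrayList<>();

    SketchSegment(long firstIndex) {
        this.firstIndex = firstIndex;
    }

    long lastIndex() {
        return firstIndex + entries.size() - 1;
    }
}

// Hypothetical sequential reader: it keeps a cursor, not an index-to-offset map.
final class SketchLogReader {
    private final List<SketchSegment> segments;
    private int segment; // position of the current segment in the list
    private int offset;  // position of the next entry within that segment

    SketchLogReader(List<SketchSegment> segments) {
        this.segments = segments;
    }

    boolean hasNext() {
        // Move past exhausted (or empty) segments.
        while (segment < segments.size()
                && offset >= segments.get(segment).entries.size()) {
            segment++;
            offset = 0;
        }
        return segment < segments.size();
    }

    String next() {
        if (!hasNext()) {
            throw new NoSuchElementException();
        }
        return segments.get(segment).entries.get(offset++);
    }

    // Reposition the cursor so that next() returns the entry at `index`.
    // This is the rare path: find the owning segment, then scan it from the start.
    void reset(long index) {
        for (int s = 0; s < segments.size(); s++) {
            SketchSegment candidate = segments.get(s);
            if (index >= candidate.firstIndex && index <= candidate.lastIndex()) {
                int position = 0;
                // Stand-in for reading entries one by one from disk.
                for (long current = candidate.firstIndex; current < index; current++) {
                    position++;
                }
                segment = s;
                offset = position;
                return;
            }
        }
        throw new IndexOutOfBoundsException("index not in log: " + index);
    }
}
```

In this sketch the common path (`hasNext`/`next`) does no lookup at all, and the cost of `reset` is bounded by the size of a single segment.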

However, removing indexes will also require refactoring the Entry abstraction. The problem is that offsets change when segments are compacted, and the in-memory index currently allows entry indexes to be mapped to offsets when an entry is released for compaction. To ensure entries can still be mapped back to offsets for compaction, the Entry should instead store a reference to a segment/offset that compaction processes can update.
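One possible shape for that refactoring is sketched below. All names (`SketchPosition`, `SketchEntry`, `SketchCompactor`) are hypothetical and do not come from Copycat; the segment is again modeled as an in-memory list, and the assumption is that each entry holds a small mutable location object which the compactor rewrites as it moves entries.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical mutable location of an entry: which segment, and where in it.
final class SketchPosition {
    private long segmentId;
    private int offset;

    SketchPosition(long segmentId, int offset) {
        this.segmentId = segmentId;
        this.offset = offset;
    }

    synchronized long segmentId() {
        return segmentId;
    }

    synchronized int offset() {
        return offset;
    }

    // Called by compaction when the entry is rewritten somewhere else.
    synchronized void moveTo(long newSegmentId, int newOffset) {
        this.segmentId = newSegmentId;
        this.offset = newOffset;
    }
}

// Hypothetical entry: carries its own location instead of relying on an index lookup.
final class SketchEntry {
    final long index;
    final SketchPosition position;
    boolean released; // true once the entry may be removed by compaction

    SketchEntry(long index, SketchPosition position) {
        this.index = index;
        this.position = position;
    }
}

final class SketchCompactor {
    // Copies live entries into a new segment and updates each entry's position,
    // so holders of a SketchEntry still see the correct segment/offset afterwards.
    static List<SketchEntry> compact(List<SketchEntry> segment, long newSegmentId) {
        List<SketchEntry> compacted = new ArrayList<>();
        for (SketchEntry entry : segment) {
            if (entry.released) {
                continue; // dropped by compaction
            }
            entry.position.moveTo(newSegmentId, compacted.size());
            compacted.add(entry);
        }
        return compacted;
    }
}
```

With this layout, releasing an entry needs no index-to-offset lookup: the entry already knows where it lives, and compaction keeps that reference current.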

kuujo, Mar 12 '17 08:03