agatedb icon indicating copy to clipboard operation
agatedb copied to clipboard

iterator: unify behavior and implement `prev` and `to_last` to all Iterators

Open GanZiheng opened this issue 2 years ago • 8 comments

Signed-off-by: GanZiheng [email protected]

Unify all kinds of iterators which implement the AgateIterator trait.

Suppose we have another two extra elements in the data which the iterator iterates over, one is before the original first element, called before first, and another is after the last original element, namely after last.

  1. If we arrives the first element, and call prev, we will arrive before first, and valid will return false;
  2. If we arrives the last element, and call next, we will arrive after last, and valid will return false;
  3. At before first, if we call prev, nothing will change, and if we call next, we will move to the first element;
  4. At after last, if we call next, nothing will change, and if we call prev, we will move to the last element;

When entries are set but not committed yet, they will stay in pending_writes of the transaction, and if we build PendingWritesIterator to iterate over them, the timestamp of each entry is set to read timestamp of the transaction. So, we may get duplicate keys in MergeIterator of Iterator generated by Transaction. ~~However, the logic of prev in MergeIterator is complicated if there exists duplicated keys. So I also update the timestamp appended to key. When writing an entry, we will append commit timestamp * 10 to the key and we will use read timestamp * 10 + 5 to search an entry or build PendingWritesIterator.~~ Since all keys are different in PendingWritesIterator and other iterators respectively, to a specific key, there may be at most one pair of identical keys in all iterators. We can simplify logic of MergeIterator with this limitation.

GanZiheng avatar Jul 21 '22 07:07 GanZiheng

Why not define seek_to_first and seek_to_last just like what rocksdb does?

BusyJay avatar Jul 21 '22 08:07 BusyJay

Why not define seek_to_first and seek_to_last just like what rocksdb does?

Currently, rewind is the same as seek_to_first, and to_last added int this pull request is actually equivalent with seek_to_last. maybe I should rename these methods?

GanZiheng avatar Jul 21 '22 08:07 GanZiheng

Codecov Report

Merging #184 (b7138ea) into master (475e17b) will increase coverage by 1.19%. The diff coverage is 98.55%.

@@            Coverage Diff             @@
##           master     #184      +/-   ##
==========================================
+ Coverage   89.32%   90.52%   +1.19%     
==========================================
  Files          39       39              
  Lines        8752     9487     +735     
==========================================
+ Hits         7818     8588     +770     
+ Misses        934      899      -35     

codecov[bot] avatar Jul 21 '22 08:07 codecov[bot]

Generally LGTM. But I wonder if there is any use case inside AgateDB that we will need to do both forward iterating and backward iterating on a single iterator? If not, we can use const generics to implement forward/backward iterator using one piece of code. It will also simplifies the case on how to seek.

skyzh avatar Jul 27 '22 03:07 skyzh

… some examples: https://github.com/singularity-data/risingwave/blob/main/src/storage/src/hummock/iterator/backward_concat.rs

skyzh avatar Jul 27 '22 03:07 skyzh

Generally LGTM. But I wonder if there is any use case inside AgateDB that we will need to do both forward iterating and backward iterating on a single iterator? If not, we can use const generics to implement forward/backward iterator using one piece of code. It will also simplifies the case on how to seek.

Previously, AgateDB has both forward and reverse iterators, but once the kind of iterator is determined, it could only go ahead but cannot go back. Current I support both forward and reverse iterator to go back.

GanZiheng avatar Jul 27 '22 03:07 GanZiheng

Generally LGTM. But I wonder if there is any use case inside AgateDB that we will need to do both forward iterating and backward iterating on a single iterator? If not, we can use const generics to implement forward/backward iterator using one piece of code. It will also simplifies the case on how to seek.

Previously, AgateDB has both forward and reverse iterators, but once the kind of iterator is determined, it could only go ahead but cannot go back. Current I support both forward and reverse iterator to go back.

Yes, so I'm wondering whether this is necessary... I can hardly come up with a use case that users want to change direction on-the-fly.

skyzh avatar Jul 27 '22 03:07 skyzh

Generally LGTM. But I wonder if there is any use case inside AgateDB that we will need to do both forward iterating and backward iterating on a single iterator? If not, we can use const generics to implement forward/backward iterator using one piece of code. It will also simplifies the case on how to seek.

Previously, AgateDB has both forward and reverse iterators, but once the kind of iterator is determined, it could only go ahead but cannot go back. Current I support both forward and reverse iterator to go back.

Yes, so I'm wondering whether this is necessary... I can hardly come up with a use case that users want to change direction on-the-fly.

In fact, that's because RocksDB supports it and TiKV needs it :rofl:.

GanZiheng avatar Jul 27 '22 03:07 GanZiheng