rust-rocks icon indicating copy to clipboard operation
rust-rocks copied to clipboard

Issue with iterator and lifetimes

Open winstonewert opened this issue 5 years ago • 6 comments

Consider the implementation of next() for Iterator

fn next(&mut self) -> Option<Self::Item> {
        if self.is_valid() {
            let k = self.key();
            let v = self.value();
            self.next();
            Some((k, v))
        } else {
            None
        }
    }

The self.next() line invalidates the returned references from self.key() and self.value(). Thus this iterator is always returning bad references.

But why didn't Rust catch this?

pub fn key(&self) -> &'a [u8] {

key (and value) claims to return a reference with lifetime 'a. But this isn't really accurate, as it should only be valid until the next &mut call on self (probably on .next()). Rust doesn't catch the problem because of the unsafe code used to implement the function (obviously unavoidable.)

winstonewert avatar Apr 29 '20 01:04 winstonewert

Haven't figured out how to express the lifetime restrictions of key & value in Rust.

Is ReadOptions::pin_data solves?

  // Keep the blocks loaded by the iterator pinned in memory as long as the
  // iterator is not deleted, If used when reading from tables created with
  // BlockBasedTableOptions::use_delta_encoding = false,
  // Iterator's property "rocksdb.iterator.is-key-pinned" is guaranteed to
  // return 1.
  // Default: false
  bool pin_data;

bh1xuw avatar May 02 '20 05:05 bh1xuw

It seems to me that key() and value() ought to be simply like:

pub fn key(&self) -> &[u8] {

The references remain valid until the iterator is modified by calling .next()

pin_data looks like it would solve the problem for keys, but at the cost of keeping all the keys in memory which wouldn't work in my case since I'm iterating over a large table and really don't want to keep all those keys in memory. And it wouldn't solve for value() in any case.

I think the problem here is that iterator doesn't allow you to return references that only valid until .next() is called.

winstonewert avatar May 02 '20 15:05 winstonewert

The next() should be

fn next<'k, 'v>(&'a mut self) -> Option<(&'k [u8], &'v [u8])>

When pin_data enabled, 'k is the same as 'a. Or else 'k and 'v live just before next next() call.

Haven't figure out how to express the lifetime restriction in Rust. 🤣

bh1xuw avatar May 11 '20 03:05 bh1xuw

Workaround:

In for _ in it {} blocks: nothing changed.

While collecting for later use:

// `it` here, for later use, you must hold the original `mut it` somewhere.
it.map(|(key, val)| (key.to_vec(), val.to_vec())).collect::<Vec<_>>();

bh1xuw avatar May 15 '20 02:05 bh1xuw

In my case, that would load the whole multi-gb datastore into memory, not a good plan.

A better workaround is to use a while loop with is_valid()

winstonewert avatar May 15 '20 02:05 winstonewert

I will add a link to this issue in README. Many thanks.

bh1xuw avatar May 15 '20 02:05 bh1xuw