rangemap Observing range reductions and splits as the result of an insertion

Motivation

I have a use-case where I need to be able to observe and mutate the ranges that:

Are the result of a split, or
whose size is reduced.

That is, I don't care about the ranges that are completely replaced by the new one, only the ones that get reduced in size or are split in half. This would be useful to ensure that the ranges in the map always have a unique value. This is also useful to entirely avoid the V: Clone requirement for insertions (.clone() may not have enough context for the type).

As far as I can tell, there isn't a way to do this efficiently (either before or after the insertion). Happy to be proven wrong on this!

API

I admit this might be a pretty niche use-case, but if you think it has a spot in this crate, then here's a possible interface for it:

pub enum MaybeSplit<'a, T> {
    /// The range needs to be split. Give a reference to one side of the split,
    /// so it can be cloned.
    Split(&'a T),
    /// The range size needs to be reduced. Allow the user to get ownership and
    /// remap the value.
    Reduced(T),
}

impl<K, V> RangeMap<K, V>
where
    K: Ord + Clone,
    V: Eq,
{
    pub fn insert_with<F>(&mut self, range: Range<K>, value: V, f: F)
    where
        F: FnMut(MaybeSplit<V>) -> V,
    {
        todo!()
    }
}

Here, f would be called if the insertion would trigger one of the two scenarios above. f would take the range's value and map it to a new value. Given the circumstances where it might be called, I believe it can only be called up to 2 times.

RangeMap::insert could also be implemented in terms of insert_with, where f would just clone in the MaybeSplit::Split case.

Example

let mut map: RangeMap<u32, u32> = RangeMap::new();
let mut id = 0;

map.insert(0..10, id);
id += 1;
map.insert(10..20, id);
id += 1;

map.insert_with(8..12, id, |v| {
    match v {
        // Give a new ID to ranges that get split.
        MaybeSplit::Split(_) => {
            let new_id = id;
            id += 1;
            new_id
        }
        // Keep the same value for reduced ranges.
        MaybeSplit::Reduced(v) => v,
    }
});
id += 1;

I'm definitely open to feedback on this interface. I think insertions/deletions are the only legitimate case where V needs Clone, so maybe this could also be a way to avoid that requirement. A similar interface could be implemented for RangeMap::remove.

As always, I'm happy to implement all of this if this seems reasonable.

Feb 06 '23 23:02 jasonwhite

Thanks for writing this up, @jasonwhite. I'm definitely open to adding something along these lines. I have some concerns and questions about the API around how it would interact with coalescing, and whether to instead go for two separate callbacks. I'll write up some examples when I get a moment.

If we can do something that works well with coalescing the way it is now (i.e. still mandatory) then I'd be happy to land it whenever — I'm just wary of adding a new API that I might need to change soon. I suppose we could land something behind a feature gate if you'd rather have something in sooner rather than later and can tolerate a bit of churn if I/we need to tweak the API later.

Feb 07 '23 05:02 jeffparsons

Thanks for the quick response!

The coalescing part might be tricky, but I'll try to make sure that works. If adjacent ranges can have their values change, then I believe they need to be compared with the next range over. Cursors would probably make that easier. I'll play around with with an implementation this week.

I think a feature flag for this makes sense.

Feb 07 '23 19:02 jasonwhite

The coalescing part might be tricky, but I'll try to make sure that works.

I couldn't make it work with coalescing, but here's what I implemented: https://github.com/jasonwhite/rangemap/commit/15309d158a6951441fe4a677bef65cd13136a62b

Unfortunately, that implementation falls short in a couple of ways:

It doesn't handle coalescing. I think if this is really a requirement, then we'll need the BTreeMap cursors API to be able to look at adjacent ranges. Otherwise, it just seems too challenging right now to be worth it.
It doesn't remove the need for V: Clone, which was one of my other motivations for this feature. This can be solved without cursors, but cursors would make it easier.

Because of these challenges, I decided to implement my own simple range map for my particular niche use-case. (It only took a day to implement everything I needed.)

While testing it on my use-case, I realized I also need to observe the ranges that get removed, so I am now pretty convinced that this is the right API for this feature:

/// A trait for observing and mutating the values that are affected by an
/// insertion into a range map.
pub trait ValueMapper<T> {
    /// The range is being split. The given value is a reference to the left
    /// side of the split. This function should return the new value for the
    /// right side. A typical implementation may just clone the left side.
    fn split(&mut self, left_side: &mut T) -> T;

    /// The range has been reduced in size.
    fn reduced(&mut self, _value: &mut T) {}

    /// The value has been removed from the range map.
    fn removed(&mut self, _value: T) {}
}

impl<K, V> RangeMap<K, V>
where
    K: Ord + Clone,
{
    pub fn insert_with<M>(&mut self, range: Range<K>, value: V, mapper: &mut M)
    where
        M: ValueMapper<V>,
    {
        todo!()
    }
}

impl<K, V> RangeMap<K, V>
where
    K: Ord + Clone,
    V: Clone,
{
    pub fn insert(&mut self, range: Range<K>, value: V) {
        let mut mapper = DefaultValueMapper::new();
        self.insert_with(range, value, &mut mapper)
    }
}

Notes:

I'm not fond of the name ValueMapper. Naming things is hard, yo.
DefaultValueMapper is just a simple impl of ValueMapper that requires V: Clone and just clones the value in the split impl.
Having this ValueMapper trait seems better than having multiple FnMuts being passed to insert_with.

Feb 09 '23 21:02 jasonwhite

Sweet, thanks @jasonwhite.

Now that I've seen your API I do prefer the all-in-one trait approach for callbacks, primarily because I think it will make client code more readable.

I'm open to introducing this API even before having cursors, and just whacking a big ol' warning in the docs for anything that will currently be super inefficient. I.e. bet on a future that has cursors in it, and at least offer the convenience of this API to people who don't care about the slowness for now.

In a few days I may be able to tinker with this a bit. In particular I want to see if I can find a nice way for optional coalescing to fit in to this. New philosophy being: if you don't need it, you shouldn't have to pay for it, but if you do need it then it should be cheap — at least once cursors land. 😅

Bonus ramblings: I'm considering doing a parallel re-implementation using cursors in-crate. Then I can easily test and benchmark the new implementation against the old, and if you turn on the "unstable" feature then you automatically get better performance as well. And then somewhere down the line we can just delete the old one and pretend it never happened.

Feb 11 '23 02:02 jeffparsons

rangemap rangemap copied to clipboard

Observing range reductions and splits as the result of an insertion

Motivation

API

Example

rangemap
rangemap copied to clipboard