rust-lapper icon indicating copy to clipboard operation
rust-lapper copied to clipboard

Specify Range Inclusivity and Exclusivity

Open zaporter opened this issue 2 years ago • 4 comments

I have been using Lapper quite a bit and have realized that there is one area of the API that seems a bit confusing.

Consider

let lap: Lapper<usize,usize> = Lapper::new(vec![
    Interval{start:0, stop:10, val:0},
    Interval{start:20, stop:30, val:1},
]);
assert_eq!(lap.find(10, 20).count(),0);

The documentation for find() states that it Find all intervals that overlap start .. stop If I interpret that as [start,stop) then it is consistent with the definition of Interval:

/// Represent a range from [start, stop)
/// Inclusive start, exclusive of stop
pub struct Interval<I, T>

However, right now I feel like I am walking on eggshells every time I make a query because I have to ensure I am thinking about these inclusivity bounds. Something like the following would take a lot of load off the programmer.

let lap: Lapper<usize,usize> = Lapper::new(vec![
    Interval{start:Inclusive::from(0), stop:Inclusive::from(10), val:0},
    Interval{start:20, stop:30, val:1},
]);
assert_eq!(lap.find(Inclusive::from(10), 20).count(),1); // should just include the first one
assert_eq!(lap.find(10, Inclusive::from(20)).count(),0); // should include neither as second is not inclusive at the start
assert_eq!(lap.find(10, 21).count(),1); // should include second

/\ This is just a quick demo. I am not attached to anything like this and think it is a bit ugly. (I also want to make sure that anyone who upgrades their version is not surprised by new behavior)

You're a more experienced Rust programmer: do you know a more idiomatic way to do this? Are you at all interested in merging something that would allow for explicit inclusivity/exclusivity?

I will happily write all of the code and do all of the leg work to get this merged if we can agree on an API design.

zaporter avatar Sep 18 '22 04:09 zaporter

Edit: This would also enable a reasonable find_point(val:I)->Intervals that include val because find(k,k) doesn't always give obvious results depending on where k is in the interval

zaporter avatar Sep 18 '22 23:09 zaporter

Hi Zack!

I see what you mean, and I think I take it for granted that a "start inclusive, stop exclusive" range is front of mind for everyone since I'm coming at this from a genomics background where that is pretty common.

I would be interested in changing things to allow the user to define the inclusiveness of the range, but I'm not sure how to do that without a performance regression. I'd probably look to have an interval be Interval { range: R, val: T } where R implements https://doc.rust-lang.org/std/ops/trait.RangeBounds.html (or some other base Range trait that I'm maybe not finding right now).

RE your edit - is the main purpose of the range changes just to allow for searching for points specifically? I'd be much more inclined to add just find_point than to rework the inclusiveness. I think it would be worth a bit of effort to explore stabbing interval queries to see if we can be a bit more efficient than just the binary search. I'd be open to PRs for this doing either the obvious thing and using the internal iterators / binary search or something more complicated.

Sorry I've been slow to respond, I've had a lot going in IRL the last few weeks (and upcoming few weeks as well)!

On Sun, Sep 18, 2022 at 5:48 PM Zack @.***> wrote:

Edit: This would also enable a reasonable find_point(val:I)->Intervals that include val because find(k,k) doesn't always give obvious results depending on where k is in the interval

— Reply to this email directly, view it on GitHub https://github.com/sstadick/rust-lapper/issues/19#issuecomment-1250413667, or unsubscribe https://github.com/notifications/unsubscribe-auth/ABTGZHL3PTLJ4AM65BRIKM3V66S4PANCNFSM6AAAAAAQPIYPGM . You are receiving this because you are subscribed to this thread.Message ID: @.***>

sstadick avatar Sep 19 '22 13:09 sstadick

I'm glad to hear that you are open to this idea! Thank you for mentioning RangeBounds, I have not used them before. I will play around a bit with the repo and get back to you when I have the start of a reasonable implementation.

Re 'find_point' => No, this is not just for 'find_point'. If done properly, I hope that this might also allow for relaxation of the type bounds for I as well as a more general interface for interval trees.

I will experiment and get back to you (with benchmarks to assess performance differences).

I'm very thankful for your time. I hope all is well!

zaporter avatar Sep 19 '22 15:09 zaporter

I never ended up getting around to this -- the initial attempts were unsuccessful as they required some breaking changes. Passing up the gauntlet if someone else wants to implement this. Otherwise, may want to close as wont-do.

zaporter avatar Apr 22 '24 21:04 zaporter