js-sdsl icon indicating copy to clipboard operation
js-sdsl copied to clipboard

Get index of key-value pair in OrderedMap

Open stanf0rd opened this issue 3 years ago • 5 comments

Is your feature request related to a problem? Please describe. I have such problem: I want to know how much values are between of two keys.
I have a timeline of thousands of events, and I use an OrderedMap (key is timestamp, value — event data), because I draw a line chart for them. I want to give a user an opportunity to know, how many events are between two chosen points.

Describe the solution you'd like If I'd have an index of an item, I could do this:

// eventMap is an OrderedMap
const leftIndex = eventMap.getIndexByKey(leftKey)
const rightIndex = eventMap.getIndexByKey(rightKey)
const eventsBetween = rightIndex - leftIndex - 1

Describe alternatives you've considered I could also have a simple method like OrderedMap.prototype.itemsBetweenKeys:

// eventMap is an OrderedMap
const eventsBetween = eventMap.itemsBetweenKeys(leftKey, rightKey)

stanf0rd avatar Aug 28 '22 19:08 stanf0rd

Thanks for your suggestion, but we have no idea how to implent it with time complexy less or equal than O(logn).

You know we have bidirectional iterator in map, but it is not a random access iterator. We could not konw the position of the iterator exactly.

In lowerBound, we just use the idea of binary search.

For this question, I think you can write a function with O(n).

Like this:

function getRangeArr(leftKey, rightKey) {
    // get the first element greater or equals to leftKey
    const leftIterator = eventMap.lowerBound(leftKey);
    // get the first element less or equals to rightKey
    // or use reverseUpperBound
    const rightIterator = eventMap.reverseLowerBound(rightKey);
    const end = eventMap.end();
    let arr = 0;
    while (leftIterator.equals(end) === false && leftIterator.equals(rightIterator) === false) {
        const key = leftIterator.equals.pointer;
        arr.push(key);
        leftIterator = leftIterator.next();
    }
    return arr;
}

ZLY201 avatar Aug 29 '22 03:08 ZLY201

Today I suddenly had a good idea to implement this function, maybe we can record the size of the subtree to get the index value of the node in the tree.

We will reconsider this feature.

Thanks :)

ZLY201 avatar Aug 30 '22 05:08 ZLY201

Since this function increases the complexity of insertion and deletion (add recount function), we need to test its performance.

We may not pass this PR if it proves to have a performance impact.

ZLY201 avatar Sep 02 '22 15:09 ZLY201

I benchmarked this PR:

  • Environment:
Darwin 21.6.0 arm64
Node.JS 16.14.0
V8 9.4.146.24-node.20
Apple M1 Pro × 10
  • Before change:

insert

js-sdsl x 15,246 ops/sec ±3.54% (89 runs sampled)

remove

js-sdsl x 389,348 ops/sec ±0.50% (99 runs sampled)
  • After change:

insert

js-sdsl x 12,444 ops/sec ±4.05% (91 runs sampled)

remove

js-sdsl x 387,418 ops/sec ±0.35% (96 runs sampled)

You can see that this has a big impact on the performance of the insert function.

The biggest problem is to modify the subtree size of each node from the bottom up for each insertion.

I'm trying to get a better way to implement this function.

Today I recalled the related knowledge of line segment tree, maybe we can label each node with lazy tag like the optimized line segment tree.

Do this we don't need to change the subtree size each time but update it when user use the index property.

This will not affect the performance of the insert function, and the time complexity of getting the index is still guaranteed to be O(logn).

This seems to work. I'll try it later.

ZLY201 avatar Sep 10 '22 10:09 ZLY201

I benchmarked this PR:

  • Environment:
Darwin 21.6.0 arm64
Node.JS 16.14.0
V8 9.4.146.24-node.20
Apple M1 Pro × 10
  • Before change:

insert

js-sdsl x 15,246 ops/sec ±3.54% (89 runs sampled)

remove

js-sdsl x 389,348 ops/sec ±0.50% (99 runs sampled)
  • After change:

insert

js-sdsl x 12,444 ops/sec ±4.05% (91 runs sampled)

remove

js-sdsl x 387,418 ops/sec ±0.35% (96 runs sampled)

You can see that this has a big impact on the performance of the insert function.

The biggest problem is to modify the subtree size of each node from the bottom up for each insertion.

I'm trying to get a better way to implement this function.

Today I recalled the related knowledge of line segment tree, maybe we can label each node with lazy tag like the optimized line segment tree.

Do this we don't need to change the subtree size each time but update it when user use the index property.

This will not affect the performance of the insert function, and the time complexity of getting the index is still guaranteed to be O(logn).

This seems to work. I'll try it later.

Failed.

ZLY201 avatar Sep 10 '22 15:09 ZLY201

We managed to implement this feature without compromising performance and expect to release a new version next week, sorry for keeping you waiting.

See https://github.com/js-sdsl/js-sdsl/pull/14.

@stanf0rd

ZLY201 avatar Sep 20 '22 06:09 ZLY201

New version 4.1.5-beta.0 has been published with index enabled.

We will release the official version next week if no bug reports.

ZLY201 avatar Sep 23 '22 05:09 ZLY201

@ZLY201 Thank you so much for that feature. It will make my experience of using your library so much better 🔥

stanf0rd avatar Sep 23 '22 10:09 stanf0rd

@ZLY201 Thank you so much for that feature. It will make my experience of using your library so much better 🔥

Sorry, there's a bug.

See https://github.com/js-sdsl/js-sdsl/issues/47.

We will fix it soon.

ZLY201 avatar Sep 23 '22 10:09 ZLY201

Version 4.1.5 published.

ZLY201 avatar Oct 01 '22 06:10 ZLY201