datastructures icon indicating copy to clipboard operation
datastructures copied to clipboard

decrease-key

Open wlandau opened this issue 7 years ago • 28 comments

I am interested in using a priority queue for ropensci/drake#227, and I am really happy to have found this package. Is there a decrease-key method available? I could not find one in the documentation, and I will need it to modify the priorities in the queue as jobs complete at unpredictable times in a custom scheduler.

wlandau avatar Feb 13 '18 21:02 wlandau

cc @krlmlr

wlandau avatar Feb 13 '18 21:02 wlandau

Hey,

no, it is not yet there, because I never really found use for it. I'll implement it asap. Shouldn't take too long, since I am only interfacing the Boost methods anyways. However, I think I recently submitted to CRAN which recommends submitting only every 2 months afaik. If you are fine with installing from GitHub directly, this isn't an issue though.

Cheers, S

dirmeier avatar Feb 13 '18 21:02 dirmeier

Awesome! That would be fantastic!

I think I can get by with a GitHub dependency for awhile. Drake is in the middle of heavy development, and we are trying to iron out some big issues.

The CRAN policies say 1-2 months, and in the past, I have submitted monthly updates with no complaints from CRAN. But how often you submit is up to you.

wlandau avatar Feb 13 '18 22:02 wlandau

Hey, the latest release adds a decrease-key method. If you have a look at the vignette, you can see how the method is used. Let me know if this works out for you, then I'd submit to CRAN.

dirmeier avatar Feb 19 '18 18:02 dirmeier

Fantastic, @dirmeier! I will look at it ASAP this week.

wlandau avatar Feb 19 '18 18:02 wlandau

Thanks! I forgot to mention that drake could benefit from a priority queue with integer keys, I believe they can be implemented with decrease_key() in O(1)?

krlmlr avatar Feb 20 '18 07:02 krlmlr

Hey, in the amortized complexity table you can see the data structures that boost::heap implements and their respective runtimes. For the heap namespace such a data structure does not exist. I guess you are referring to Thorup's paper?

dirmeier avatar Feb 20 '18 10:02 dirmeier

I forgot about that paper, but it does look useful. Maybe elsewhere in the huge boost universe?

krlmlr avatar Feb 20 '18 16:02 krlmlr

So, from a quick look into the stl and Boost data structures I couldn't find something that would suit you.

However, my plan in the beginning was anyway to implement most of these into one large package and enforce a class hierarchy similar to Java's (on the R side). So maybe something useful will turn up in future Boost releases. For now, I don't want to implement these data structures myself.

dirmeier avatar Feb 21 '18 21:02 dirmeier

@dirmeier Thanks for your prompt work! I looked at the vignette, and the usage of decrease_key() is super clear. I think we're almost there. Unfortunately, for the purposes of ropensci/drake#227, we won't know the keys we want to decrease in advance. We start with a vector of values (in this case, a bunch of target names), and we need to decrease the keys of all the nodes corresponding to those values. And I expect many of those nodes to have the same keys. Is this possible?

wlandau avatar Feb 22 '18 02:02 wlandau

Hey, yeah that works. If you call handle you also get the value of the node. So you'd look at the value and take the handle to decrease the key, like that:

handle(fheap, -15)

## [[1]]
## [[1]]$handle
## [1] "1ab5c4ef-c3eb-4f91-b0fc-c6262c434ec6"
## 
## [[1]]$value
## [1] "a" "b" "c" "d"
## 
## 
## [[2]]
## [[2]]$handle
## [1] "b8f51bfb-a7cd-476c-9bf2-e3256abf0d41"
## 
## [[2]]$value
## [1] "a"

For example, to decrease the key-value pair -15/"a" to -16, you take handle "b8f51bfb-a7cd-476c-9bf2-e3256abf0d41" and call decrease_key(fheap, from=-15, to=-16, handle="b8f51bfb-a7cd-476c-9bf2-e3256abf0d41"). I don't like that solution very much, because you always need to do some sort of comparison with the value to be able to get to the handle. Comparing doubles for equality is usually a bad idea, so the user would need to figure out something on his own here. If your target names are integers or characters, this isn't a problem though.

However, if you have any suggestions to simplify this process, I'd be glad to hear them :)

dirmeier avatar Feb 22 '18 09:02 dirmeier

Is it possible to look up a handle using a value rather than a key? That's really what I'm after.

handle(fheap, value = "a")

or even vectorize this somehow?

keys <- handle(fheap, values = c("a", "b", "c", "d")) %>%
  purrr::map_int(f = function(x) x$key)
decrease_key(fheap, from = keys, to = keys - 1)

wlandau avatar Feb 22 '18 12:02 wlandau

Sure, good idea, I'll add this. Thanks for the feedback.

dirmeier avatar Feb 22 '18 12:02 dirmeier

Hey, I've added this feature preliminarily with v0.2.3. I'll further optimize/modularize the code on the weekend, currently I am rather busy... Hope that works.

From the vignette:

h <- handle(fheap, value="V-10")
h
## [[1]]
## [[1]]$handle
## [1] "cac83f38-7a5d-4082-8fd3-0484b4858a98"
## 
## [[1]]$key
## [1] 10
## 
## 
## [[2]]
## [[2]]$handle
## [1] "e066ef06-b0fd-462e-8ca3-c24d48614a15"
## 
## [[2]]$key
## [1] 10

decrease_key(fheap, 10, -100, h[[1]]$handle)

Note: CXX_STD = CXX14 now, that's why travis complains..

dirmeier avatar Feb 27 '18 08:02 dirmeier

Ill submit this to CRAN once I've added some more methods for all the classes. Shouldn't take too long though..

dirmeier avatar Feb 27 '18 08:02 dirmeier

Looks great! All I would ask is for handle() and decrease_key() to iterate over multiple values/handles, but that is just extra. Looking forward to the new CRAN release.

wlandau avatar Feb 27 '18 14:02 wlandau

Forgot to ask for one last thing: how do I list all the values in an Fibonacci heap? Turns out I will need this too if I am going to use datastructures.

wlandau avatar Feb 27 '18 15:02 wlandau

Usually, this is not a function common for Fibonacci heaps, because it somewhat hurts the purpose of a priority queue, but if helps, I'll add it.

Thanks for all the feedback, this is really valueable. :)

dirmeier avatar Feb 27 '18 16:02 dirmeier

Glad all this is helping you too. How do you suppose we can fix the CXX_STD = CXX14 error?

wlandau avatar Mar 05 '18 16:03 wlandau

Hey, this shouldn't be an issue. I think if we choose a compiler version that supports C++14 for travis, it will be fine. If travis does not have, returning to C++11 is also not a problem. Sorry for the delay, I am currently quite busy..

dirmeier avatar Mar 07 '18 08:03 dirmeier

Hey, this tag added getting all values from a fibonacci heap.

Furthermore, in order to make the package suitable to randy3k's question https://github.com/dirmeier/datastructures/issues/5 I needed to modify the handle method for values.

h <- handle(fheap, value="V-10")

This operation before ran in constant time, but required storing the values in an additional map, next to the actual heap data structure. Now it iterates over all nodes in the heap to get to a specific value. Is the value found the handle is returned. For your scenario this is not too optimal, but I find it more reasonable that way. However, to solve this for you, I added a multimap data structures that allows storing multiple identical key->value pairs (see cppreference). So I would advise using this (or a hashmap, bimap or R environment) to store value-> node handle/id pairs and then use this for decreasing single nodes in the heap.

Upon re-reading I am not sure if it got clear what I meant. :D Please let me know if that helped. Thanks!

Cheers, Simon

dirmeier avatar Mar 11 '18 22:03 dirmeier

Thanks, Simon. I hope to dive back into this soon. I am planning a workers package that will use a priority queue from datastructures.

wlandau avatar Mar 12 '18 03:03 wlandau

Cool, glad to hear the package has use to someone. I'll put v0.2.5 on CRAN the in the meantime until I have time to work on issue https://github.com/dirmeier/datastructures/issues/5

dirmeier avatar Mar 12 '18 08:03 dirmeier

Dear @wlandau,

in order to be able to use arbitrary R objects with heaps, I think we need to drop the

handle(fheap, value="V-10")

method. Reason being that, while I can easily compare vectors for equality, this doesn't work if I use SEXPs. Are you still making use of this?

dirmeier avatar Apr 22 '18 09:04 dirmeier

Thanks for letting me know, Simon. Unfortunately, this probably means I will not be able to use datastructures as a priority queue for parallel workers.

wlandau avatar Apr 24 '18 03:04 wlandau

The work around I've implemented so far is to get all values from the heap and then keep only the relevant items, e.g. using purrr::keep:

  values(heap) %>%
    purrr::keep(~ .x$value == whatever you want) %>%
    purrr::map_chr(~.x$"handle")

That puts more burden on the user, but will should solve the problem. Once I am done implementing this, I come back to you.

dirmeier avatar Apr 24 '18 07:04 dirmeier

Hi Will,

just forget about my last post. Turns out the C API has a function for comparison of SEXPs. So you can use the handle as before.

hands <- handle(fheap, value=list("V1", list(a=2)))
hands

## [[1]]
## [[1]]$handle
## [1] "handle-0"
## 
## [[1]]$key
## [1] -1000
...

I still advise to use this cautiously, for non-primitive values, for instance data frames or matrices. I'll release this to CRAN in the next days.

dirmeier avatar Apr 29 '18 14:04 dirmeier

I am really glad to hear it. I was only planning on using primitive values anyway, and I know the struggle of different users pulling in different directions. Thanks for accommodating.

wlandau avatar Apr 29 '18 21:04 wlandau