datastructures
datastructures copied to clipboard
decrease-key
I am interested in using a priority queue for ropensci/drake#227, and I am really happy to have found this package. Is there a decrease-key method available? I could not find one in the documentation, and I will need it to modify the priorities in the queue as jobs complete at unpredictable times in a custom scheduler.
cc @krlmlr
Hey,
no, it is not yet there, because I never really found use for it. I'll implement it asap. Shouldn't take too long, since I am only interfacing the Boost methods anyways. However, I think I recently submitted to CRAN which recommends submitting only every 2 months afaik. If you are fine with installing from GitHub directly, this isn't an issue though.
Cheers, S
Awesome! That would be fantastic!
I think I can get by with a GitHub dependency for awhile. Drake
is in the middle of heavy development, and we are trying to iron out some big issues.
The CRAN policies say 1-2 months, and in the past, I have submitted monthly updates with no complaints from CRAN. But how often you submit is up to you.
Hey, the latest release adds a decrease-key method. If you have a look at the vignette, you can see how the method is used. Let me know if this works out for you, then I'd submit to CRAN.
Fantastic, @dirmeier! I will look at it ASAP this week.
Thanks! I forgot to mention that drake could benefit from a priority queue with integer keys, I believe they can be implemented with decrease_key()
in O(1)?
Hey, in the amortized complexity table you can see the data structures that boost::heap
implements and their respective runtimes. For the heap
namespace such a data structure does not exist. I guess you are referring to Thorup's paper?
I forgot about that paper, but it does look useful. Maybe elsewhere in the huge boost universe?
So, from a quick look into the stl and Boost data structures I couldn't find something that would suit you.
However, my plan in the beginning was anyway to implement most of these into one large package and enforce a class hierarchy similar to Java's (on the R side). So maybe something useful will turn up in future Boost releases. For now, I don't want to implement these data structures myself.
@dirmeier Thanks for your prompt work! I looked at the vignette, and the usage of decrease_key()
is super clear. I think we're almost there. Unfortunately, for the purposes of ropensci/drake#227, we won't know the keys we want to decrease in advance. We start with a vector of values (in this case, a bunch of target names), and we need to decrease the keys of all the nodes corresponding to those values. And I expect many of those nodes to have the same keys. Is this possible?
Hey, yeah that works. If you call handle
you also get the value of the node. So you'd look at the value and take the handle to decrease the key, like that:
handle(fheap, -15)
## [[1]]
## [[1]]$handle
## [1] "1ab5c4ef-c3eb-4f91-b0fc-c6262c434ec6"
##
## [[1]]$value
## [1] "a" "b" "c" "d"
##
##
## [[2]]
## [[2]]$handle
## [1] "b8f51bfb-a7cd-476c-9bf2-e3256abf0d41"
##
## [[2]]$value
## [1] "a"
For example, to decrease the key-value pair -15/"a"
to -16
, you take handle "b8f51bfb-a7cd-476c-9bf2-e3256abf0d41"
and call decrease_key(fheap, from=-15, to=-16, handle="b8f51bfb-a7cd-476c-9bf2-e3256abf0d41")
. I don't like that solution very much, because you always need to do some sort of comparison with the value
to be able to get to the handle
. Comparing doubles for equality is usually a bad idea, so the user would need to figure out something on his own here. If your target names are integers or characters, this isn't a problem though.
However, if you have any suggestions to simplify this process, I'd be glad to hear them :)
Is it possible to look up a handle using a value rather than a key? That's really what I'm after.
handle(fheap, value = "a")
or even vectorize this somehow?
keys <- handle(fheap, values = c("a", "b", "c", "d")) %>%
purrr::map_int(f = function(x) x$key)
decrease_key(fheap, from = keys, to = keys - 1)
Sure, good idea, I'll add this. Thanks for the feedback.
Hey, I've added this feature preliminarily with v0.2.3. I'll further optimize/modularize the code on the weekend, currently I am rather busy... Hope that works.
From the vignette:
h <- handle(fheap, value="V-10")
h
## [[1]]
## [[1]]$handle
## [1] "cac83f38-7a5d-4082-8fd3-0484b4858a98"
##
## [[1]]$key
## [1] 10
##
##
## [[2]]
## [[2]]$handle
## [1] "e066ef06-b0fd-462e-8ca3-c24d48614a15"
##
## [[2]]$key
## [1] 10
decrease_key(fheap, 10, -100, h[[1]]$handle)
Note: CXX_STD = CXX14 now, that's why travis complains..
Ill submit this to CRAN once I've added some more methods for all the classes. Shouldn't take too long though..
Looks great! All I would ask is for handle()
and decrease_key()
to iterate over multiple values/handles, but that is just extra. Looking forward to the new CRAN release.
Forgot to ask for one last thing: how do I list all the values in an Fibonacci heap? Turns out I will need this too if I am going to use datastructures
.
Usually, this is not a function common for Fibonacci heaps, because it somewhat hurts the purpose of a priority queue, but if helps, I'll add it.
Thanks for all the feedback, this is really valueable. :)
Glad all this is helping you too. How do you suppose we can fix the CXX_STD = CXX14
error?
Hey, this shouldn't be an issue. I think if we choose a compiler version that supports C++14 for travis, it will be fine. If travis does not have, returning to C++11 is also not a problem. Sorry for the delay, I am currently quite busy..
Hey,
this tag added getting all values
from a fibonacci heap.
Furthermore, in order to make the package suitable to randy3k's question https://github.com/dirmeier/datastructures/issues/5 I needed to modify the handle
method for values.
h <- handle(fheap, value="V-10")
This operation before ran in constant time, but required storing the values in an additional map, next to the actual heap data structure. Now it iterates over all nodes in the heap to get to a specific value. Is the value found the handle is returned. For your scenario this is not too optimal, but I find it more reasonable that way.
However, to solve this for you, I added a multimap
data structures that allows storing multiple identical key->value
pairs (see cppreference). So I would advise using this (or a hashmap, bimap or R environment) to store value-> node handle/id
pairs and then use this for decreasing single nodes in the heap.
Upon re-reading I am not sure if it got clear what I meant. :D Please let me know if that helped. Thanks!
Cheers, Simon
Thanks, Simon. I hope to dive back into this soon. I am planning a workers
package that will use a priority queue from datastructures
.
Cool, glad to hear the package has use to someone. I'll put v0.2.5 on CRAN the in the meantime until I have time to work on issue https://github.com/dirmeier/datastructures/issues/5
Dear @wlandau,
in order to be able to use arbitrary R objects with heaps, I think we need to drop the
handle(fheap, value="V-10")
method. Reason being that, while I can easily compare vectors for equality, this doesn't work if I use SEXPs. Are you still making use of this?
Thanks for letting me know, Simon. Unfortunately, this probably means I will not be able to use datastructures
as a priority queue for parallel workers.
The work around I've implemented so far is to get all values from the heap and then keep only the relevant items, e.g. using purrr::keep
:
values(heap) %>%
purrr::keep(~ .x$value == whatever you want) %>%
purrr::map_chr(~.x$"handle")
That puts more burden on the user, but will should solve the problem. Once I am done implementing this, I come back to you.
Hi Will,
just forget about my last post. Turns out the C API has a function for comparison of SEXP
s.
So you can use the handle as before.
hands <- handle(fheap, value=list("V1", list(a=2)))
hands
## [[1]]
## [[1]]$handle
## [1] "handle-0"
##
## [[1]]$key
## [1] -1000
...
I still advise to use this cautiously, for non-primitive values, for instance data frames or matrices. I'll release this to CRAN in the next days.
I am really glad to hear it. I was only planning on using primitive values anyway, and I know the struggle of different users pulling in different directions. Thanks for accommodating.