multidplyr
A dplyr backend that partitions a data frame over multiple processes
`qs` comes with a lot of native dependencies that can be a bit of a challenge to manage (e.g. 5 libraries in this package and PCRE2 in stringfish). Since the...
I'm getting an error when I try to use "drop_last" in `summarize()` with `multidplyr`; here's a reprex: ``` r library(nycflights13) library(multidplyr) cluster #> Attaching package: 'dplyr' #> The following objects...
It seems multidplyr is not honoring GROUP BY semantics, as seen in the reprex below. ``` r library(tidyverse) library(multidplyr) data % group_by(group) %>% summarise(sum = sum(int), n = n()) #> #...
`unnest_longer()`, a tidyr function, fails with what seems to be a red-herring error message. ``` r library(tidyverse) library(multidplyr) data % unnest_longer(col = list_col, values_to = "unlisted") #> # A tibble: 4...
cluster_copy() returns the following rlang error: ```r cluster
I would like to wrap multidplyr code into a function, since I am applying the parallelised multidplyr code to multiple datasets. However, this significantly slows the processing speed, to a...
As per title: ```R cluster % multidplyr::partition(cluster) %>% dplyr::summarise(avg_mpg = mean(mpg)) %>% dplyr::collect() } someFunc(mtcars, cluster) ``` If you are lucky 2 or more cores (main+workers) will randomly get used,...
The `{multidplyr}` package changes class of object distributed to workers to `multidplyr_party_df`. This causes a loss of the "special sauce" that is provided by the `{sf}` package for spatial datasets...
Tested (testthat) with R 3.3.3 on Windows.
The README and other multidplyr tutorials created by the authors do not mention how to close clusters once they have been started. I apologise, as I'm probably not using the package...