gregleleu comments

Results: 41 comments of gregleleu

Collect UDT with arrow enabled

It doesn't happen anymore with Spark 3.4.0, and now the column is collected as a binary column (not a jobj pointer). (Still the same on 3.3.1)

Warnings with binary files

I'm on a Mac, yes, but it also shows up in the Sedona CI: https://github.com/apache/sedona/actions/runs/5603748794/jobs/10250817504 (section "run tests"). I don't get an error either when I run the sparklyr tests,...

Warnings with binary files

We are now getting this issue even on mutates and other verbs in the Sedona tests. dbGetQuery doesn't generate this, so I'm pretty sure the warning comes from some operation...

Warnings with binary files

Some more research: it seems to come from the new "by" in dplyr. The warning gets generated by a call to this function in tidyselect: `tidyselect_data_proxy`. dbplyr has a `tidyselect_data_proxy.tbl_lazy`...

Warnings with binary files

Found it! It's the call to `tidyselect_data_proxy.tbl_spark` and the subsequent call to `simulate_vars_spark`; it's here: https://github.com/sparklyr/sparklyr/blob/22aa571c1cb2820b916fff9e0860647c2ea024f5/R/utils.R#L475 Submitting a PR.

Install instructions Sedona/R for AWS EMR?

The "hard" part is setting up EMR with R, a few resources:
* https://spark.rstudio.com/deployment/yarn-cluster-emr
* https://aws.amazon.com/blogs/big-data/running-sparklyr-rstudios-r-interface-to-spark-on-amazon-emr/

Then you just need to make sure the cluster has the sedona jars. One...

Rectangles' borders in ggmosaic

`linewidth` should be the parameter you're looking for.

Investigate using vctrs for plumbing

Any chance you're still looking at this? The package is using deprecated functions from rlang.

not resolved from current namespace (catboost)

Hey, when you install it using `devtools::install_github('catboost/catboost', subdir = 'catboost/R-package')`, the package gets the latest R code from GitHub but does not compile the C/C++ functions; it downloads the precompiled...

not resolved from current namespace (catboost)

@ckiefer I've checked, the 0.24.2 release for Linux (here: https://github.com/catboost/catboost/releases/download/v0.24.2/catboost-R-Linux-0.24.2.tgz) does not have calls to the new functions (e.g. `catboost.get_plain_params` in `catboost.train`). Are you sure your installation worked? Try running...