
Implement file_name_mapper in the R sparklyr API

Open piotrszul opened this issue 2 years ago • 0 comments

Functions such as ptl_read_ndjson should support file_name_mapper to allow flexible mapping of filenames to resource types.

The sparklyr Java interface, however, does not provide any standard mechanism for R callback functions (i.e. calling R code back from the JVM) in the way that Py4J supports Python callbacks.

It might be possible to adapt the code used in spark_apply(), although this may require digging deep into the sparklyr implementation and could couple the solution tightly to sparklyr internals.

A better approach may be to implement an explicit mapper in Java, one that explicitly maps each file name to its resource types, and then to construct it in R using an R lambda. That would also require an interface from R to list all the files in a directory described by a Spark-supported filesystem URL.
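A minimal sketch of what such an explicit mapper could look like on the Java side. The class name ExplicitFileNameMapper is hypothetical and not part of Pathling or sparklyr, and the sketch assumes the mapper is consumed as a function from a file name (without extension) to a set of resource type codes. Returning an empty set for unlisted files is also an assumption, not a settled design.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Set;
import java.util.function.Function;

// Hypothetical class for illustration only; not an existing Pathling or sparklyr API.
// It holds a precomputed file-name-to-resource-type table, so no R code has to be
// called back from the JVM when files are read.
final class ExplicitFileNameMapper implements Function<String, Set<String>> {

  // File name (assumed to be given without extension) -> resource type codes.
  private final Map<String, Set<String>> mappings = new HashMap<>();

  ExplicitFileNameMapper(Map<String, Set<String>> mappings) {
    // Defensive copy so later changes to the caller's map do not affect the mapper.
    mappings.forEach((name, types) -> this.mappings.put(name, Set.copyOf(types)));
  }

  @Override
  public Set<String> apply(String fileName) {
    // Assumption: files missing from the table map to no resource types.
    return mappings.getOrDefault(fileName, Set.of());
  }
}
```

The table itself would be built in R by applying the user's R function to each listed file name before the mapper is constructed, so the R lambda only runs on the R side. How the resulting table is passed across the sparklyr bridge is left open here.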

piotrszul, Aug 09 '23 00:08