
☕⛵WIP PySpark dependency management

Results 21 coffee_boat issues

https://github.com/pantsbuild/pex

For now we leave it out, since many providers don't support it yet. Long story, buy me a :coffee:.

Right now we do some terrible things with overriding PYTHONPATH, which is great and works in the general case. If the Spark+K8 folks end up integrating better first party...
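
To make the idea concrete, here is a minimal sketch of overriding the Python search path for a child process. It assumes the variable in question is Python's standard `PYTHONPATH`; the helper name `run_with_extra_path` is hypothetical and is not part of coffee_boat, and this is not necessarily how the project does it.

```python
import os
import subprocess
import sys


def run_with_extra_path(extra_dir, code):
    """Run `code` in a child Python with `extra_dir` prepended to PYTHONPATH.

    Hypothetical helper for illustration only.
    """
    env = dict(os.environ)
    existing = env.get("PYTHONPATH", "")
    # Prepend so the shipped packages win over anything already on the path.
    env["PYTHONPATH"] = extra_dir + (os.pathsep + existing if existing else "")
    result = subprocess.run(
        [sys.executable, "-c", code],
        env=env,
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.strip()
```

Prepending rather than replacing keeps whatever the environment already had on the path, which is probably the less surprising choice when the override is applied on someone else's cluster.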

We currently have one example notebook. It would be good to update the example to distribute PyArrow, since this will be useful for vectorized UDF users in Spark 2.3+.

help wanted
good first issue

In theory, most of what we do goes through adding files in Spark, which should be handled, but I'm less certain about the decompressed directory. We should investigate this.

Right now we create a bunch of temp files but don't really clean them up. There is a flag to do part of this but it needs to be tested...

help wanted
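
One possible shape for the temp-file cleanup, as a rough sketch only: track every temporary directory as it is created and remove them all in one place. The `TempTracker` class and the `coffee_boat_` prefix are hypothetical names, and this does not reflect the existing flag mentioned above.

```python
import atexit
import os
import shutil
import tempfile


class TempTracker:
    """Hypothetical helper that remembers temp dirs so they can be removed."""

    def __init__(self):
        self._paths = []

    def new_dir(self, prefix="coffee_boat_"):
        # mkdtemp creates the directory and returns its absolute path.
        path = tempfile.mkdtemp(prefix=prefix)
        self._paths.append(path)
        return path

    def cleanup(self):
        # Remove everything we created; ignore dirs that are already gone.
        while self._paths:
            shutil.rmtree(self._paths.pop(), ignore_errors=True)


tracker = TempTracker()
# Best-effort cleanup when the interpreter exits normally.
atexit.register(tracker.cleanup)
```

Calling `tracker.cleanup()` explicitly is still worthwhile in long-running drivers, since `atexit` handlers only run on normal interpreter shutdown.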