Update py-pyspark and py-py4j
- Update versions for py-pyspark
- Add a default `java` variant to py-py4j that enforces a Java dependency (disable it to use the system Java). Without this, py4j fails to initialize when Java is absent or incompatible with Spark (a sketch appears after this list)
- Add a matching variant and dependencies to py-pyspark so it requires Java through py4j
- Make the explicit py4j-to-pyspark version mapping easier to read and update (see the mapping sketch below)
- Add dependencies listed on the pyspark dependencies page: pyarrow, pandas, numpy, grpcio, grpcio-status, and googleapis-common-protos
- Add new versions of pyarrow and arrow (for gcc 14 compatibility)
- Bump three packages to meet Spark's version expectations: grpcio, grpcio-status, and googleapis-common-protos
- Add new protobuf versions (needed by the grpcio-status dependency) along with gcc 14 compatibility conflicts (see the conflicts sketch below)
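A minimal sketch of how the py-py4j side could look in its `package.py`. The variant name, description, and dependency types here are assumptions to illustrate the idea, not the exact diff:

```python
# py-py4j/package.py -- sketch only; the variant name and dependency
# types are assumptions, not the exact change in this PR.
from spack.package import *


class PyPy4j(PythonPackage):
    """Enables Python programs to access arbitrary Java objects."""

    pypi = "py4j/py4j-0.10.9.7.tar.gz"

    # Default variant: pull in a Spack-provided JVM. Disabling it
    # leaves py4j to find a system Java at runtime instead.
    variant("java", default=True, description="Depend on a Spack-provided Java")

    depends_on("py-setuptools", type="build")
    depends_on("java", type="run", when="+java")
```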
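And a sketch of the py-pyspark side, showing one way to keep the py4j pin per pyspark release in a single easy-to-edit table and to propagate the `java` variant through py4j. The version pairs and dependency floors below are illustrative (taken loosely from pyspark's published requirements), not copied from the PR:

```python
# py-pyspark/package.py -- sketch; versions and floors are illustrative.
from spack.package import *


class PyPyspark(PythonPackage):
    """Python bindings for Apache Spark."""

    pypi = "pyspark/pyspark-3.5.1.tar.gz"

    variant("java", default=True, description="Require Java via py-py4j")

    # Each pyspark release pins an exact py4j release; keeping the
    # pairs in one table makes future bumps a one-line edit.
    for _pyspark, _py4j in {
        "3.5.1": "0.10.9.7",
        "3.4.1": "0.10.9.7",
        "3.3.1": "0.10.9.5",
    }.items():
        depends_on(f"py-py4j@{_py4j}", type=("build", "run"), when=f"@{_pyspark}")

    # Propagate the java variant through py4j.
    depends_on("py-py4j+java", when="+java", type=("build", "run"))
    depends_on("py-py4j~java", when="~java", type=("build", "run"))

    # Dependencies from the pyspark dependencies page (floors assumed).
    depends_on("py-pandas@1.0.5:", type="run")
    depends_on("py-pyarrow@4:", type="run")
    depends_on("py-numpy@1.15:", type="run")
    depends_on("py-grpcio@1.56:", type="run", when="@3.5:")
    depends_on("py-grpcio-status@1.56:", type="run", when="@3.5:")
    depends_on("py-googleapis-common-protos@1.56.4:", type="run", when="@3.5:")
```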
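For the gcc 14 side, Spack expresses "this version does not build with this compiler" as a `conflicts` directive. The version boundary below is a placeholder assumption, not the one in the PR:

```python
# In the protobuf package.py -- sketch; the real boundary version must
# come from which releases actually fail to compile under GCC 14.
conflicts("%gcc@14:", when="@:3.21", msg="older protobuf does not build with GCC 14")
```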
Not included (but probably should be): allowing py-pyspark to be built from source (or provided as a virtual package by a spark built from source).
I'm not sure about the best way to set defaults for py4j & java.