[bitnami/spark] generate bitnami/spark with python 3.10.x
Name and Version
bitnami/spark:3.2.2
What is the problem this feature will solve?
Add support for Python 3.10.x, so that the image works with a PySpark driver running Python 3.10.
What is the feature you are proposing to solve the problem?
Without python 3.10.x the following error is raised: "Python in worker has different version 3.8 than that in driver 3.10".
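For context, the error comes from a version check made when a Python worker starts: the driver and the worker must agree on the major.minor Python version. The sketch below only illustrates that comparison; it is not PySpark's actual code, and the function names `minor_version` and `check_worker_version` are made up for this example.

```python
def minor_version(version):
    """Reduce a version such as '3.10.6' to its 'major.minor' part.

    Expects at least two dot-separated components.
    """
    major, minor = version.split(".")[:2]
    return f"{major}.{minor}"


def check_worker_version(driver_version, worker_version):
    """Raise if driver and worker differ in their major.minor Python version.

    Illustrative only: the message copies the wording quoted in this issue.
    """
    driver = minor_version(driver_version)
    worker = minor_version(worker_version)
    if driver != worker:
        raise RuntimeError(
            f"Python in worker has different version {worker} "
            f"than that in driver {driver}"
        )


# Patch releases may differ; only major.minor has to match.
check_worker_version("3.10.6", "3.10.1")
```

With the versions from this issue, `check_worker_version("3.10.6", "3.8.13")` raises, which is the situation a Python 3.10 driver hits against the Python 3.8 shipped in the image.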
What alternatives have you considered?
First, building my own Docker image.
Or, publishing a new bitnami/spark tag that supports Python 3.10.x. This only requires changing the following line in the Dockerfile:
# current version 3.2
RUN . /opt/bitnami/scripts/libcomponent.sh && component_unpack "python" "3.8.13-166" --checksum 9a5fba755f6c8d60eacc80f366f3fbaa57d003913e48c31ba337037bb69e37b3
# proposal for the new tag
RUN . /opt/bitnami/scripts/libcomponent.sh && component_unpack "python" "3.10.6-7" --checksum 02e5a66908664141ad80a1d40b390a71e8cec13771cb38600c322d36747fb298
Hi,
Without python 3.10.x the following error is raised: "Python in worker has different version 3.8 than that in driver 3.10".
Could you specify the steps that make this message appear?
I'm facing the same issue when submitting a job with spark-submit from a recent Python 3.10 environment (a developer's laptop) to Spark running inside this container.
Could you provide an environment that lets us reproduce the issue consistently?
Could you provide an environment that lets us reproduce the issue consistently?
conda create -n spark310 python=3.10 && conda install pyspark && spark-submit …
EDIT: I'm using the spark-submit from https://www.apache.org/dyn/closer.lua/spark/spark-3.3.0/spark-3.3.0-bin-hadoop3.tgz instead of the conda-installed spark-submit.
my complete command is
./bin/spark-submit --master spark://host-with-bitnami-spark:7077 --conf "spark.driver.extraJavaOptions=--add-exports java.base/jdk.internal.misc=ALL-UNNAMED --add-opens=java.base/sun.nio.ch=ALL-UNNAMED" ~/Dev/my-project/my-spark-job.py
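A small job along the following lines should be enough to exercise the check, since any Python RDD operation starts Python workers on the executors. This is a hypothetical sketch, not the contents of the my-spark-job.py mentioned above; the helper name `report_versions` is an assumption made for illustration.

```python
import sys


def python_minor_version():
    """Return the 'major.minor' version of the running interpreter."""
    return "%d.%d" % sys.version_info[:2]


def report_versions(spark_context):
    """Return (driver_version, worker_versions) for a live SparkContext.

    The map() below runs inside the Python workers. If a worker's Python
    differs from the driver's in major.minor, the action is expected to
    fail with the error quoted in this issue instead of returning.
    """
    driver = python_minor_version()
    workers = (
        spark_context.parallelize(range(4), 2)
        # The expression is inlined so the lambda depends only on `sys`.
        .map(lambda _: "%d.%d" % sys.version_info[:2])
        .distinct()
        .collect()
    )
    return driver, workers
```

In a job file, this would be called as `report_versions(spark.sparkContext)` after creating a session with `SparkSession.builder.getOrCreate()`, and the file submitted with the command above.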
Ok,
I will create a task for updating the Python version to 3.10. I cannot guarantee an ETA, but as soon as there is news, I will update the ticket.
@javsalgar any news on this? I also get this error
This Issue has been automatically marked as "stale" because it has not had recent activity (for 15 days). It will be closed if no further activity occurs. Thanks for the feedback.
Due to the lack of activity in the last 5 days since it was marked as "stale", we proceed to close this Issue. Do not hesitate to reopen it later if necessary.
Request to reopen. Python 3.8 is really outdated now, and building a 3.10 image ourselves is a considerable operational burden.
Please note that at this moment the Python version used in the Bitnami Spark container is 3.9, not 3.8.
Regarding 3.10, we found some issues in some of the distros supported as part of the VMware Application Catalog (Debian 11, CentOS 7, PhotonOS 3 & 4, Ubuntu 18.04, 20.04 & 22.04, RedHat UBI 8 & 9). We'll review whether it's possible to bump the version, but note that none of the Python versions (3.8, 3.9 or 3.10) is close to reaching EOL.