
java.lang.ClassNotFoundException: Class org.apache.hadoop.fs.s3a.S3AFileSystem

Open farshadsm opened this issue 2 years ago • 23 comments

Hi, I get the following error when I submit my Spark job to a Kubernetes cluster using the Spark Operator.

java.lang.RuntimeException: java.lang.ClassNotFoundException: Class org.apache.hadoop.fs.s3a.S3AFileSystem

My YAML file configuration is below. I've made sure "hadoop-aws" and "aws-java-sdk" have compatible versions. I was able to successfully run the job for the "pi.py" script that ships at "/opt/spark/examples/src/main/python/pi.py" in the Spark Operator's container image. However, when Spark in my Python script tries to read a CSV file from an AWS S3 bucket, I get the error shown above. I've tried many different versions of hadoop-aws, but none of them resolved the issue. Could you please help me out?

apiVersion: "sparkoperator.k8s.io/v1beta2" kind: SparkApplication metadata: name: test-spark-hadoop-aws-3.2.3 namespace: default spec: deps: repositories: - https://repo.maven.apache.org/maven2/ packages: - org.apache.hadoop:hadoop-aws:3.2.3 - org.apache.hadoop:hadoop-common:3.2.3 - com.amazonaws:aws-java-sdk:1.11.901 sparkConf: spark.driver.extraJavaOptions: "-Divy.cache.dir=/tmp -Divy.home=/tmp -Dcom.amazonaws.services.s3.enableV4=true" spark.executor.extraJavaOptions: "-Dcom.amazonaws.services.s3.enableV4=true"
hadoopConf: fs.s3a.aws.credentials.provider: "org.apache.hadoop.fs.s3a.SimpleAWSCredentialsProvider" fs.s3a.impl: "org.apache.hadoop.fs.s3a.S3AFileSystem" fs.s3a.access.key: "my AWS access key" fs.s3a.secret.key: "my AWS secret key" fs.s3a.endpoint: "s3.us-east-1.amazonaws.com"
type: Python pythonVersion: "3" mode: cluster image: "gcr.io/spark-operator/spark-py:v3.1.1-hadoop3" imagePullPolicy: Always mainApplicationFile: s3a://mybucket/script.py arguments: ['s3a://mybucket/test.csv'] sparkVersion: "3.1.1" restartPolicy: type: OnFailure onFailureRetries: 3 driver: cores: 1 coreLimit: "1200m" memory: "4G" labels: version: 3.1.1 serviceAccount: my-serviceAccount executor: cores: 1 instances: 1 memory: "4G" labels: version: 3.1.1

farshadsm avatar Apr 09 '22 21:04 farshadsm

Following up on my previous comment, I should mention that the kubectl logs show the following download statistics:

---------------------------------------------------------------------
|                  |            modules            ||   artifacts   |
|       conf       | number| search|dwnlded|evicted|| number|dwnlded|
---------------------------------------------------------------------
|      default     |  253  |  251  |  251  |   2   ||  251  |  251  |
---------------------------------------------------------------------

It also shows the following lines:

:: problems summary ::
:::: ERRORS
SERVER ERROR: Bad Gateway url=https://dl.bintray.com/spark-packages/maven/org/apache/hadoop/hadoop-main/3.2.3/hadoop-main-3.2.3.jar

SERVER ERROR: Bad Gateway url=https://dl.bintray.com/spark-packages/maven/org/apache/hadoop/hadoop-project/3.2.3/hadoop-project-3.2.3.jar

SERVER ERROR: Bad Gateway url=https://dl.bintray.com/spark-packages/maven/com/amazonaws/aws-java-sdk-pom/1.11.901/aws-java-sdk-pom-1.11.901.jar

SERVER ERROR: Bad Gateway url=https://dl.bintray.com/spark-packages/maven/com/amazonaws/aws-java-sdk/1.11.901/aws-java-sdk-1.11.901-sources.jar

SERVER ERROR: Bad Gateway url=https://dl.bintray.com/spark-packages/maven/com/amazonaws/aws-java-sdk/1.11.901/aws-java-sdk-1.11.901-src.jar

SERVER ERROR: Bad Gateway url=https://dl.bintray.com/spark-packages/maven/com/amazonaws/aws-java-sdk/1.11.901/aws-java-sdk-1.11.901-javadoc.jar

SERVER ERROR: Bad Gateway url=https://dl.bintray.com/spark-packages/maven/org/apache/apache/13/apache-13.jar

SERVER ERROR: Bad Gateway url=https://dl.bintray.com/spark-packages/maven/org/apache/commons/commons-parent/28/commons-parent-28.jar

SERVER ERROR: Bad Gateway url=https://dl.bintray.com/spark-packages/maven/org/apache/apache/21/apache-21.jar

SERVER ERROR: Bad Gateway url=https://dl.bintray.com/spark-packages/maven/org/apache/httpcomponents/httpcomponents-parent/11/httpcomponents-parent-11.jar

SERVER ERROR: Bad Gateway url=https://dl.bintray.com/spark-packages/maven/org/apache/httpcomponents/httpcomponents-client/4.5.13/httpcomponents-client-4.5.13.jar

SERVER ERROR: Bad Gateway url=https://dl.bintray.com/spark-packages/maven/org/apache/httpcomponents/httpcomponents-core/4.4.13/httpcomponents-core-4.4.13.jar

SERVER ERROR: Bad Gateway url=https://dl.bintray.com/spark-packages/maven/org/apache/apache/18/apache-18.jar

SERVER ERROR: Bad Gateway url=https://dl.bintray.com/spark-packages/maven/org/apache/commons/commons-parent/42/commons-parent-42.jar

SERVER ERROR: Bad Gateway url=https://dl.bintray.com/spark-packages/maven/com/fasterxml/oss-parent/24/oss-parent-24.jar

SERVER ERROR: Bad Gateway url=https://dl.bintray.com/spark-packages/maven/com/fasterxml/jackson/jackson-parent/2.6.2/jackson-parent-2.6.2.jar

SERVER ERROR: Bad Gateway url=https://dl.bintray.com/spark-packages/maven/com/fasterxml/jackson/core/jackson-databind/2.6.7.3/jackson-databind-2.6.7.3-javadoc.jar

SERVER ERROR: Bad Gateway url=https://dl.bintray.com/spark-packages/maven/com/fasterxml/oss-parent/23/oss-parent-23.jar

SERVER ERROR: Bad Gateway url=https://dl.bintray.com/spark-packages/maven/com/fasterxml/jackson/jackson-parent/2.6.1/jackson-parent-2.6.1.jar

SERVER ERROR: Bad Gateway url=https://dl.bintray.com/spark-packages/maven/org/sonatype/oss/oss-parent/9/oss-parent-9.jar

SERVER ERROR: Bad Gateway url=https://dl.bintray.com/spark-packages/maven/io/netty/netty-parent/4.1.48.Final/netty-parent-4.1.48.Final.jar

SERVER ERROR: Bad Gateway url=https://dl.bintray.com/spark-packages/maven/com/amazonaws/aws-java-sdk-models/1.11.901/aws-java-sdk-models-1.11.901-javadoc.jar

SERVER ERROR: Bad Gateway url=https://dl.bintray.com/spark-packages/maven/com/amazonaws/aws-java-sdk-pom/1.11.22/aws-java-sdk-pom-1.11.22.jar

:: USE VERBOSE OR DEBUG MESSAGE LEVEL FOR MORE DETAILS ::
:: retrieving :: org.apache.spark#spark-submit-parent-0d835d60-447d-4a98-941b-e22aaa69903c
confs: [default]
251 artifacts copied, 0 already retrieved (374746kB/445ms)
22/04/09 21:34:55 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
22/04/09 21:34:56 WARN MetricsConfig: Cannot locate configuration: tried hadoop-metrics2-s3a-file-system.properties,hadoop-metrics2.properties

farshadsm avatar Apr 09 '22 21:04 farshadsm

This might be related?

JFrog to Shut down JCenter and Bintray https://www.infoq.com/news/2021/02/jfrog-jcenter-bintray-closure/

jkleckner avatar Apr 09 '22 23:04 jkleckner

Thanks for pointing me to the link about Bintray. What should I put in my YAML file so that nothing is downloaded from JCenter? In my YAML file I set "https://repo.maven.apache.org/maven2/" for ".spec.deps.repositories", and I was hoping that with this setting no requests to Bintray would be made.
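For context, and as a hedged aside rather than something confirmed in this thread: .spec.deps.repositories maps to Spark's --repositories option, which only adds repositories on top of Spark's built-in Ivy resolvers, and older Spark builds still included the Bintray-hosted spark-packages resolver among those defaults, so listing Maven Central does not suppress the Bintray lookups. One way to replace the defaults entirely is spark.jars.ivySettings pointing at a custom ivysettings.xml; the path below is a hypothetical example, and the file would need to be baked into the image or mounted.

  sparkConf:
    # Hypothetical path: an ivysettings.xml that declares only Maven Central,
    # shipped in the image or mounted into the driver, replaces Spark's
    # default resolvers (including the Bintray-hosted spark-packages repo).
    spark.jars.ivySettings: "/opt/spark/conf/ivysettings.xml"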

farshadsm avatar Apr 10 '22 00:04 farshadsm

Have you solved this problem yet? I ran into a similar problem. I can find the class org.apache.hadoop.fs.s3a.S3AFileSystem now, but I now get java.lang.ClassNotFoundException: org.apache.hadoop.fs.StreamCapabilities

Zhang-Aoqi avatar Apr 12 '22 12:04 Zhang-Aoqi

@Zhang-Aoqi Same issue here, and I finally figured it out! The error does not come from your Spark app; it happens in your spark-operator pod. In my case, my Spark app depends on Hadoop 3.2, but the spark-operator pod I installed using Helm ships Hadoop 2.7 jar files. Please check whether yours is fine :)

hyungryuk avatar Apr 13 '22 09:04 hyungryuk

@Zhang-Aoqi Same issue here, and I finally figured it out! The error does not come from your Spark app; it happens in your spark-operator pod. In my case, my Spark app depends on Hadoop 3.2, but the spark-operator pod I installed using Helm ships Hadoop 2.7 jar files.

Oh, I just solved this problem too. I changed the Spark version running in the pod. It seems we are working on similar tasks; if you run into any problems, feel free to reach out.

Zhang-Aoqi avatar Apr 13 '22 09:04 Zhang-Aoqi

@Zhang-Aoqi Good to know! Thanks. So did you change your Spark version?

hyungryuk avatar Apr 13 '22 09:04 hyungryuk

@hyungryuk I used Spark-3.0.0-bin-hadoop-3.2 as the image in the pod, and added aws-java-sdk-bundle-1.11.375.jar and hadoop-aws-3.2.0.jar. I'm a newbie, so I'm not sure if I explained that clearly.

Zhang-Aoqi avatar Apr 13 '22 09:04 Zhang-Aoqi

@Zhang-Aoqi Alright! If this error happens again, try this image to build spark-operator: gcr.io/spark-operator/spark:v3.1.1-hadoop3

A similar issue here: https://github.com/GoogleCloudPlatform/spark-on-k8s-operator/issues/1334

hyungryuk avatar Apr 13 '22 10:04 hyungryuk

@hyungryuk Thanks for your comments. I'll check with our engineer who set up the Kubernetes cluster and installed the Spark pod, and I'll let everyone in this thread know the result. I hope your solution resolves my issue.

farshadsm avatar Apr 13 '22 16:04 farshadsm

@Zhang-Aoqi Thanks for your comments. I'll work on the resolutions suggested by @hyungryuk and will let you know the outcome.

farshadsm avatar Apr 13 '22 16:04 farshadsm

@hyungryuk I've noticed that we used the "spark-operator-chart-1.1.19" Helm chart to install spark-operator on the k8s cluster. How can I find the version of the Hadoop jar files that come with this Helm chart?

farshadsm avatar Apr 13 '22 20:04 farshadsm

@farshadsm Just exec into your spark-operator pod and run the following command :)

ls /opt/spark/jars | grep hadoop
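For reference, here is one way to run that from outside the pod; the namespace and pod name below are placeholders, so substitute your own:

# List the operator pods, then inspect the Hadoop jars bundled in one of them.
kubectl get pods -n spark-operator
kubectl exec -n spark-operator <spark-operator-pod-name> -- ls /opt/spark/jars | grep hadoop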

hyungryuk avatar Apr 14 '22 00:04 hyungryuk

@Zhang-Aoqi Alright! If this error happens again, try this image to build spark-operator: gcr.io/spark-operator/spark:v3.1.1-hadoop3

A similar issue here: #1334

Ok, thank you

Zhang-Aoqi avatar Apr 14 '22 00:04 Zhang-Aoqi

I ran into the same error.

My solution is:

I built our own Spark image based on spark-3.2.1-bin-hadoop3.2.tgz

and then added these jars under $SPARK_HOME/jars/
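For illustration only (the exact jar list is not shown above), a Dockerfile following the same pattern might look like the sketch below. The base image name is a placeholder, the hadoop-aws version has to match the Hadoop jars already bundled in the image (check with ls $SPARK_HOME/jars | grep hadoop-common), and the aws-java-sdk-bundle version should be whatever that hadoop-aws release declares on mvnrepository.com; the 3.3.1/1.11.901 pairing below is an assumption:

# Placeholder base image built from spark-3.2.1-bin-hadoop3.2.tgz
FROM my-registry/spark-py:3.2.1

ENV SPARK_HOME /opt/spark

# Versions are assumptions: hadoop-aws must match the bundled Hadoop version,
# and aws-java-sdk-bundle must match that hadoop-aws release.
RUN curl -fSL https://repo1.maven.org/maven2/org/apache/hadoop/hadoop-aws/3.3.1/hadoop-aws-3.3.1.jar \
      -o ${SPARK_HOME}/jars/hadoop-aws-3.3.1.jar && \
    curl -fSL https://repo1.maven.org/maven2/com/amazonaws/aws-java-sdk-bundle/1.11.901/aws-java-sdk-bundle-1.11.901.jar \
      -o ${SPARK_HOME}/jars/aws-java-sdk-bundle-1.11.901.jar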

It is still in the testing stage, but it works and no problems have been found so far.

Hope it helps

allenhaozi avatar May 12 '22 12:05 allenhaozi

Hi, would you mind sharing the Docker image? I tried copying the jars as suggested, but I still encounter the issue :(

upMKuhn avatar Jun 05 '22 23:06 upMKuhn

I created a public image :) registry.gitlab.com/upmkuhn/spark-operator:v3-2-hadoop3-aws-2

upMKuhn avatar Jun 06 '22 16:06 upMKuhn

@upMKuhn allenhaozi/base-pyspark-3.2.1-py-v3.8:v0.1.0

allenhaozi avatar Jun 07 '22 06:06 allenhaozi

@upMKuhn allenhaozi/base-pyspark-3.2.1-py-v3.8:v0.1.0

I tested it and it works. I also built an image myself, but it is quite heavy. Could you share your Dockerfile for reference? Thanks a lot.

Xxxxxyd avatar Jun 13 '22 10:06 Xxxxxyd

@Xxxxxyd https://github.com/Wh1isper/spark-build Here are my Dockerfile and docs. Try: wh1isper/spark-executor:3.4.1

I am developing sparglim as a tool to configure PySpark quickly and easily for PySpark-based apps; see: https://github.com/Wh1isper/sparglim

Wh1isper avatar Jul 21 '23 06:07 Wh1isper

@Zhang-Aoqi Good to know! Thanks. So did you change your Spark version?

I'm having the same error but I couldn't resolve it. Could you help, please?

aimendenche avatar Aug 02 '23 12:08 aimendenche


I had the same problem (org.apache.hadoop.fs.s3a.S3AFileSystem not found) when I tried:

      deps:
        files:
          - "s3a://k8s-3c172e28d7da2e-bucket/test.jar"

Even adding the jar files inside image: "image-registry/spark-base-image" did not work. I fixed the problem by adding the necessary jars inside the spark-operator pod itself. You can rebuild your Docker image by adding the jars; I rebuilt it like this:

FROM ghcr.io/googlecloudplatform/spark-operator:v1beta2-1.3.7-3.1.1

ENV SPARK_HOME /opt/spark

RUN curl https://repo1.maven.org/maven2/org/apache/hadoop/hadoop-aws/2.7.4/hadoop-aws-2.7.4.jar -o ${SPARK_HOME}/jars/hadoop-aws-2.7.4.jar
RUN curl https://repo1.maven.org/maven2/com/amazonaws/aws-java-sdk/1.7.4/aws-java-sdk-1.7.4.jar -o ${SPARK_HOME}/jars/aws-java-sdk-1.7.4.jar

The spark-operator image ships Hadoop version 2.7 inside, so all dependencies must match exactly that version; you can find them on https://mvnrepository.com/

First, for testing, I went inside the spark-operator pod with this command:

kubectl exec -it spark-operator-fb8f779cb-gt657 -n spark-operator -- bash

Then, inside my spark-operator pod, I went to /opt/spark/jars and downloaded the jars (for example, curl https://repo1.maven.org/maven2/org/apache/hadoop/hadoop-aws/2.7.4/hadoop-aws-2.7.4.jar).

Then I applied my manifest with deps.files and it worked.
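If you rebuild the operator image this way, you still need to point the Helm release at it. A minimal sketch, assuming the chart's standard image.repository and image.tag values; the release name, repo name, registry, and tag below are placeholders:

helm upgrade spark-operator spark-operator-repo/spark-operator \
  --namespace spark-operator \
  --set image.repository=my-registry/spark-operator-s3a \
  --set image.tag=v1beta2-1.3.7-3.1.1-s3a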

ViktorGlushak avatar Aug 25 '23 14:08 ViktorGlushak