spark-operator
SparkApplication stuck forever?
I have a Spark application that connects to a streaming service and consumes data, but the SparkApplication has been stuck without a state for far too long:
NAME STATUS ATTEMPTS START FINISH AGE
**redacted** 5m16s
When checking the driver logs, all I see:
I1108 08:51:43.924523 10 controller.go:184] SparkApplication **redacted**/**redacted** was added, enqueuing it for submission
No pod is being created other than the operator one, and I am completely blind here. How can I debug this?
Thanks
Edit: After a while it crashed, but the error message shows only warnings:
failed to run spark-submit for SparkApplication **redacted**/**redacted**:
WARNING: An illegal reflective access operation has occurred
WARNING: Illegal reflective access by org.apache.spark.unsafe.Platform (file:/opt/spark/jars/spark-unsafe_2.12-3.1.1.jar) to constructor java.nio.DirectByteBuffer(long,int)
WARNING: Please consider reporting this to the maintainers of org.apache.spark.unsafe.Platform
WARNING: Use --illegal-access=warn to enable warnings of further illegal reflective access operations
WARNING: All illegal access operations will be denied in a future release
https://repo1.maven.org/ added as a remote repository with the name: repo-1
Ivy Default Cache set to: /root/.ivy2/cache
The jars for the packages stored in: /root/.ivy2/jars
com.microsoft.azure#azure-eventhubs-spark_2.12 added as a dependency
:: resolving dependencies :: org.apache.spark#spark-submit-parent-a29b89a6-57af-47bc-a8c0-3e49d685c8f7;1.0
confs: [default]
found com.microsoft.azure#azure-eventhubs-spark_2.12;2.3.22 in central
found com.microsoft.azure#azure-eventhubs;3.3.0 in central
found org.apache.qpid#proton-j;0.33.8 in central
found com.microsoft.azure#qpid-proton-j-extensions;1.2.4 in central
found org.slf4j#slf4j-api;1.7.30 in central
found com.microsoft.azure#azure-client-authentication;1.7.3 in central
found com.microsoft.azure#azure-client-runtime;1.7.3 in central
found com.microsoft.rest#client-runtime;1.7.3 in central
found com.google.guava#guava;24.1.1-jre in central
found com.google.code.findbugs#jsr305;1.3.9 in central
found org.checkerframework#checker-compat-qual;2.0.0 in central
found com.google.errorprone#error_prone_annotations;2.1.3 in central
found com.google.j2objc#j2objc-annotations;1.1 in central
found org.codehaus.mojo#animal-sniffer-annotations;1.14 in central
found com.squareup.retrofit2#retrofit;2.7.2 in central
found com.squareup.okhttp3#okhttp;3.12.6 in central
found com.squareup.okio#okio;1.15.0 in central
found com.squareup.okhttp3#logging-interceptor;3.12.2 in central
found com.squareup.okhttp3#okhttp-urlconnection;3.12.2 in central
found com.squareup.retrofit2#converter-jackson;2.7.2 in central
found com.fasterxml.jackson.core#jackson-databind;2.10.1 in central
found com.fasterxml.jackson.core#jackson-annotations;2.10.1 in central
found com.fasterxml.jackson.core#jackson-core;2.10.1 in central
found com.fasterxml.jackson.datatype#jackson-datatype-joda;2.10.1 in central
found joda-time#joda-time;2.9.9 in central
found org.apache.commons#commons-lang3;3.4 in central
found io.reactivex#rxjava;1.3.8 in central
found com.squareup.retrofit2#adapter-rxjava;2.7.2 in central
found com.microsoft.azure#azure-annotations;1.10.0 in central
found commons-codec#commons-codec;1.11 in central
found com.microsoft.azure#adal4j;1.6.4 in central
found com.nimbusds#oauth2-oidc-sdk;6.5 in central
found com.sun.mail#javax.mail;1.6.1 in central
found javax.activation#activation;1.1 in central
found com.github.stephenc.jcip#jcip-annotations;1.0-1 in central
found net.minidev#json-smart;2.3 in central
[2.3] net.minidev#json-smart;[1.3.1,2.3]
found net.minidev#accessors-smart;1.2 in central
found org.ow2.asm#asm;5.0.4 in central
found com.nimbusds#lang-tag;1.7 in central
[1.7] com.nimbusds#lang-tag;[1.4.3,)
found com.google.code.gson#gson;2.8.0 in central
found com.nimbusds#nimbus-jose-jwt;9.8.1 in central
found org.scala-lang.modules#scala-java8-compat_2.12;0.9.0 in central
:: resolution report :: resolve 44200ms :: artifacts dl 2200ms
:: modules in use:
com.fasterxml.jackson.core#jackson-annotations;2.10.1 from central in [default]
com.fasterxml.jackson.core#jackson-core;2.10.1 from central in [default]
com.fasterxml.jackson.core#jackson-databind;2.10.1 from central in [default]
com.fasterxml.jackson.datatype#jackson-datatype-joda;2.10.1 from central in [default]
com.github.stephenc.jcip#jcip-annotations;1.0-1 from central in [default]
com.google.code.findbugs#jsr305;1.3.9 from central in [default]
com.google.code.gson#gson;2.8.0 from central in [default]
com.google.errorprone#error_prone_annotations;2.1.3 from central in [default]
com.google.guava#guava;24.1.1-jre from central in [default]
com.google.j2objc#j2objc-annotations;1.1 from central in [default]
com.microsoft.azure#adal4j;1.6.4 from central in [default]
com.microsoft.azure#azure-annotations;1.10.0 from central in [default]
com.microsoft.azure#azure-client-authentication;1.7.3 from central in [default]
com.microsoft.azure#azure-client-runtime;1.7.3 from central in [default]
com.microsoft.azure#azure-eventhubs;3.3.0 from central in [default]
com.microsoft.azure#azure-eventhubs-spark_2.12;2.3.22 from central in [default]
com.microsoft.azure#qpid-proton-j-extensions;1.2.4 from central in [default]
com.microsoft.rest#client-runtime;1.7.3 from central in [default]
com.nimbusds#lang-tag;1.7 from central in [default]
com.nimbusds#nimbus-jose-jwt;9.8.1 from central in [default]
com.nimbusds#oauth2-oidc-sdk;6.5 from central in [default]
com.squareup.okhttp3#logging-interceptor;3.12.2 from central in [default]
com.squareup.okhttp3#okhttp;3.12.6 from central in [default]
com.squareup.okhttp3#okhttp-urlconnection;3.12.2 from central in [default]
com.squareup.okio#okio;1.15.0 from central in [default]
com.squareup.retrofit2#adapter-rxjava;2.7.2 from central in [default]
com.squareup.retrofit2#converter-jackson;2.7.2 from central in [default]
com.squareup.retrofit2#retrofit;2.7.2 from central in [default]
com.sun.mail#javax.mail;1.6.1 from central in [default]
commons-codec#commons-codec;1.11 from central in [default]
io.reactivex#rxjava;1.3.8 from central in [default]
javax.activation#activation;1.1 from central in [default]
joda-time#joda-time;2.9.9 from central in [default]
net.minidev#accessors-smart;1.2 from central in [default]
net.minidev#json-smart;2.3 from central in [default]
org.apache.commons#commons-lang3;3.4 from central in [default]
org.apache.qpid#proton-j;0.33.8 from central in [default]
org.checkerframework#checker-compat-qual;2.0.0 from central in [default]
org.codehaus.mojo#animal-sniffer-annotations;1.14 from central in [default]
org.ow2.asm#asm;5.0.4 from central in [default]
org.scala-lang.modules#scala-java8-compat_2.12;0.9.0 from central in [default]
org.slf4j#slf4j-api;1.7.30 from central in [default]
:: evicted modules:
org.slf4j#slf4j-api;1.7.28 by [org.slf4j#slf4j-api;1.7.30] in [default]
org.slf4j#slf4j-api;1.7.22 by [org.slf4j#slf4j-api;1.7.30] in [default]
com.nimbusds#nimbus-jose-jwt;[6.0.1,) by [com.nimbusds#nimbus-jose-jwt;9.8.1] in [default]
---------------------------------------------------------------------
| | modules || artifacts |
| conf | number| search|dwnlded|evicted|| number|dwnlded|
---------------------------------------------------------------------
| default | 45 | 2 | 0 | 3 || 42 | 0 |
---------------------------------------------------------------------
:: retrieving :: org.apache.spark#spark-submit-parent-a29b89a6-57af-47bc-a8c0-3e49d685c8f7
confs: [default]
0 artifacts copied, 42 already retrieved (0kB/600ms)
23/11/08 07:55:05 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Using Spark's default log4j profile: org/apache/spark/log4j-defaults.properties
23/11/08 07:55:10 INFO SparkKubernetesClientFactory: Auto-configuring K8S client using current context from users K8S config file
23/11/08 07:55:19 INFO KerberosConfDriverFeatureStep: You have not specified a krb5.conf file locally or via a ConfigMap. Make sure that you have the krb5.conf locally on the driver image.
23/11/08 07:55:21 WARN DriverCommandFeatureStep: spark.kubernetes.pyspark.pythonVersion was deprecated in Spark 3.1. Please set 'spark.pyspark.python' and 'spark.pyspark.driver.python' configurations or PYSPARK_PYTHON and PYSPARK_DRIVER_PYTHON environment variables instead.
I have the same issue on Kubernetes v28.
I have faced the same problem. In my case, it seems likely that several SparkApplications with the same name were submitted at the same time. You should check the Spark operator pod's logs for more information.
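If it helps, here is a rough sketch of how one might scan the operator logs for that situation. It counts how often each SparkApplication shows up in a "was added, enqueuing it for submission" line, using a pattern modeled on the log line quoted in the question. The function names are made up for illustration. Treating repeated enqueuing as a sign of duplicate submissions is only an assumption, not documented operator behavior.

```python
import re
from collections import Counter

# Pattern modeled on the operator log line quoted earlier in this thread.
# Other operator versions may word this message differently (assumption).
ADDED_RE = re.compile(
    r"SparkApplication (?P<namespace>[^/\s]+)/(?P<name>\S+) "
    r"was added, enqueuing it for submission"
)


def count_submissions(log_text):
    """Count 'was added' events per (namespace, name) in operator log text."""
    counts = Counter()
    for line in log_text.splitlines():
        match = ADDED_RE.search(line)
        if match:
            counts[(match.group("namespace"), match.group("name"))] += 1
    return counts


def find_repeated(counts):
    """Return only the applications that were enqueued more than once."""
    return {app: n for app, n in counts.items() if n > 1}


def report(log_text):
    """Build one human-readable line per repeatedly enqueued application."""
    repeated = find_repeated(count_submissions(log_text))
    return [
        f"{namespace}/{name}: enqueued {n} times"
        for (namespace, name), n in sorted(repeated.items())
    ]
```

To try it, save the operator pod's logs to a file (for example with `kubectl logs` pointed at your operator pod and namespace), read the file into a string, and pass it to `report`. An empty list means no application was enqueued more than once in that log excerpt.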