spark
Running job in DeployMode = Cluster in GCP
I'm running a Spark job written in F# on GCP Dataproc, and everything works great.
By default, GCP does not gather logs from the job output, so I can find them only in the GCP console, which is not acceptable.
GCP's recommendation is to run the job with deployMode=cluster so that the output is logged and available in GCP Logging. And this works: I can see my output in Logging.
The problem is that my job fails whenever it is executed in cluster mode, with this error:
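For reference, cluster mode can be requested at submit time via a Spark property. A sketch of such a submit command, where the cluster name, region, and jar path are placeholders and not taken from the question:

```shell
# Hypothetical submit command; cluster, region, and jar values are placeholders.
# spark.submit.deployMode=cluster makes the driver run in the YARN ApplicationMaster,
# so its stdout/stderr end up in the aggregated YARN logs (and GCP Logging).
gcloud dataproc jobs submit spark \
    --cluster=my-cluster \
    --region=us-central1 \
    --properties=spark.submit.deployMode=cluster \
    --jars=gs://my-bucket/my-app.jar \
    --class=com.example.Main
```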
java.lang.IllegalStateException: User did not initialize spark context!
How can I troubleshoot this?
I use this code to initialize the session:
open System.Reflection
open Microsoft.Spark.Sql

let createSession () =
    SparkSession
        .Builder()
        .AppName(Assembly.GetExecutingAssembly().GetName().Name)
        .Config("spark.hadoop.google.cloud.auth.service.account.enable", "true")
        .Config("spark.hadoop.google.cloud.auth.service.account.json.keyfile", <key>)
        .Config("fs.gs.impl", "com.google.cloud.hadoop.fs.gcs.GoogleHadoopFileSystem")
        .EnableHiveSupport()
        .GetOrCreate()
Full error stack trace:
21/03/15 16:19:41 INFO org.apache.hadoop.yarn.client.RMProxy: Connecting to ResourceManager at custom-qa-m/xx.xx.xx.xx:8032
21/03/15 16:19:41 INFO org.apache.hadoop.yarn.client.AHSProxy: Connecting to Application History server at custom-qa-m/xx.xx.xx.x:10200
21/03/15 16:19:46 INFO org.apache.hadoop.yarn.client.api.impl.YarnClientImpl: Submitted application application_1615825101913_0001
21/03/15 16:20:18 ERROR org.apache.spark.deploy.yarn.Client: Application diagnostics message: Application application_1615825101913_0001 failed 2 times due to AM Container for appattempt_1615825101913_0001_000002 exited with exitCode: 13
Failing this attempt.Diagnostics: [2021-03-15 16:20:17.254]Exception from container-launch.
Container id: container_1615825101913_0001_02_000001
Exit code: 13
[2021-03-15 16:20:17.260]Container exited with a non-zero exit code 13. Error file: prelaunch.err.
Last 4096 bytes of prelaunch.err :
Last 4096 bytes of stderr :
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/usr/lib/spark/jars/slf4j-log4j12-1.7.16.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/usr/lib/hadoop/lib/slf4j-log4j12-1.7.25.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
21/03/15 16:20:16 ERROR org.apache.spark.deploy.yarn.ApplicationMaster: Uncaught exception:
java.lang.IllegalStateException: User did not initialize spark context!
at org.apache.spark.deploy.yarn.ApplicationMaster.runDriver(ApplicationMaster.scala:486)
at org.apache.spark.deploy.yarn.ApplicationMaster.org$apache$spark$deploy$yarn$ApplicationMaster$$runImpl(ApplicationMaster.scala:305)
at org.apache.spark.deploy.yarn.ApplicationMaster$$anonfun$run$1.apply$mcV$sp(ApplicationMaster.scala:245)
at org.apache.spark.deploy.yarn.ApplicationMaster$$anonfun$run$1.apply(ApplicationMaster.scala:245)
at org.apache.spark.deploy.yarn.ApplicationMaster$$anonfun$run$1.apply(ApplicationMaster.scala:245)
at org.apache.spark.deploy.yarn.ApplicationMaster$$anon$3.run(ApplicationMaster.scala:781)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1893)
at org.apache.spark.deploy.yarn.ApplicationMaster.doAsUser(ApplicationMaster.scala:780)
at org.apache.spark.deploy.yarn.ApplicationMaster.run(ApplicationMaster.scala:244)
at org.apache.spark.deploy.yarn.ApplicationMaster$.main(ApplicationMaster.scala:805)
at org.apache.spark.deploy.yarn.ApplicationMaster.main(ApplicationMaster.scala)
[2021-03-15 16:20:17.261]Container exited with a non-zero exit code 13. Error file: prelaunch.err.
Last 4096 bytes of prelaunch.err :
Last 4096 bytes of stderr :
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/usr/lib/spark/jars/slf4j-log4j12-1.7.16.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/usr/lib/hadoop/lib/slf4j-log4j12-1.7.25.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
21/03/15 16:20:16 ERROR org.apache.spark.deploy.yarn.ApplicationMaster: Uncaught exception:
java.lang.IllegalStateException: User did not initialize spark context!
at org.apache.spark.deploy.yarn.ApplicationMaster.runDriver(ApplicationMaster.scala:486)
at org.apache.spark.deploy.yarn.ApplicationMaster.org$apache$spark$deploy$yarn$ApplicationMaster$$runImpl(ApplicationMaster.scala:305)
at org.apache.spark.deploy.yarn.ApplicationMaster$$anonfun$run$1.apply$mcV$sp(ApplicationMaster.scala:245)
at org.apache.spark.deploy.yarn.ApplicationMaster$$anonfun$run$1.apply(ApplicationMaster.scala:245)
at org.apache.spark.deploy.yarn.ApplicationMaster$$anonfun$run$1.apply(ApplicationMaster.scala:245)
at org.apache.spark.deploy.yarn.ApplicationMaster$$anon$3.run(ApplicationMaster.scala:781)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1893)
at org.apache.spark.deploy.yarn.ApplicationMaster.doAsUser(ApplicationMaster.scala:780)
at org.apache.spark.deploy.yarn.ApplicationMaster.run(ApplicationMaster.scala:244)
at org.apache.spark.deploy.yarn.ApplicationMaster$.main(ApplicationMaster.scala:805)
at org.apache.spark.deploy.yarn.ApplicationMaster.main(ApplicationMaster.scala)
For more detailed output, check the application tracking page: http://custom-qa-m:8188/applicationhistory/app/application_1615825101913_0001 Then click on links to logs of each attempt.
. Failing the application.
Exception in thread "main" org.apache.spark.SparkException: Application application_1615825101913_0001 finished with failed status
at org.apache.spark.deploy.yarn.Client.run(Client.scala:1151)
at org.apache.spark.deploy.yarn.YarnClusterApplication.start(Client.scala:1528)
at org.apache.spark.deploy.SparkSubmit.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:845)
at org.apache.spark.deploy.SparkSubmit.doRunMain$1(SparkSubmit.scala:161)
at org.apache.spark.deploy.SparkSubmit.submit(SparkSubmit.scala:184)
at org.apache.spark.deploy.SparkSubmit.doSubmit(SparkSubmit.scala:86)
at org.apache.spark.deploy.SparkSubmit$$anon$2.doSubmit(SparkSubmit.scala:920)
at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:929)
at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
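To troubleshoot further, the full container logs mentioned in the diagnostics can be pulled with the YARN CLI on the cluster's master node, using the application id from the trace above:

```shell
# Fetch the aggregated container logs for the failed application
# (run this on the Dataproc master node, e.g. via SSH).
yarn logs -applicationId application_1615825101913_0001
```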
Can you check whether createSession is actually being called in your application? (I assume it's a function?) In cluster mode, the YARN ApplicationMaster launches your driver code and waits a limited time for it to create the SparkContext; the "User did not initialize spark context!" error (exit code 13) means that wait ran out, i.e. GetOrCreate() was never reached.
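To illustrate, a minimal driver entry point (assuming the createSession function from the question and the Microsoft.Spark package, not code from the original post) might look like:

```fsharp
open Microsoft.Spark.Sql

[<EntryPoint>]
let main argv =
    // In cluster mode the ApplicationMaster runs this entry point and waits
    // for the SparkContext; if createSession () (and thus GetOrCreate()) is
    // never reached here, YARN fails with
    // "User did not initialize spark context!" and exit code 13.
    let spark = createSession ()
    // ... run the job against `spark` ...
    spark.Stop()
    0
```

If the session is only created lazily (for example inside a branch that is skipped in cluster mode), moving the createSession () call to the top of the entry point is the usual fix.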