Spark.jl
SparkContext giving StackOverflowError
I'm following the basic steps in the tutorial and ran into some issues loading the SparkContext.
By the way, Spark.init() only works if JULIA_COPY_STACKS=1 is set.
It would be good to point this out in the documentation for others.
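For reference, here is a minimal sketch of how the variable has to be set (paths and the trailing `julia` invocation are illustrative; the key point is that the variable must be in the environment before the Julia process starts):

```shell
# JULIA_COPY_STACKS=1 must be set before Julia starts; setting
# ENV["JULIA_COPY_STACKS"] inside an already-running session has no effect.
export JULIA_COPY_STACKS=1
echo "JULIA_COPY_STACKS=$JULIA_COPY_STACKS"
# then start Julia and initialise the JVM, e.g.:
#   julia -e 'using Spark; Spark.init()'
```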
Setup
Apache Maven 3.6.3
Maven home: /usr/share/maven
Java version: 11.0.9.1, vendor: Ubuntu, runtime: /usr/lib/jvm/java-11-openjdk-amd64
Default locale: en_US, platform encoding: ANSI_X3.4-1968
OS name: "linux", version: "4.4.0-1112-aws", arch: "amd64", family: "unix"
spark-3.0.1-bin-hadoop3.2
Using Scala version 2.12.10 (OpenJDK 64-Bit Server VM, Java 11.0.9.1)
Code
app# JULIA_COPY_STACKS=1 julia
               _
   _       _ _(_)_     |  Documentation: https://docs.julialang.org
  (_)     | (_) (_)    |
   _ _   _| |_  __ _   |  Type "?" for help, "]?" for Pkg help.
  | | | | | | |/ _` |  |
  | | |_| | | | (_| |  |  Version 1.4.1 (2020-04-14)
 _/ |\__'_|_|_|\__'_|  |  Official https://julialang.org/ release
|__/                   |
julia> using Spark
julia> Spark.init()
julia> sc = SparkContext(master="local")
Error
Using Spark's default log4j profile: org/apache/spark/log4j-defaults.properties
20/11/14 23:43:06 WARN Utils: Your hostname, ip-10-202-48-234 resolves to a loopback address: 127.0.0.1; using 10.202.48.234 instead (on interface ens3)
20/11/14 23:43:06 WARN Utils: Set SPARK_LOCAL_IP if you need to bind to another address
WARNING: An illegal reflective access operation has occurred
WARNING: Illegal reflective access by org.apache.spark.unsafe.Platform (file:/opt/spark/spark-3.0.1-bin-hadoop3.2/jars/spark-unsafe_2.12-3.0.1.jar) to constructor java.nio.DirectByteBuffer(long,int)
WARNING: Please consider reporting this to the maintainers of org.apache.spark.unsafe.Platform
WARNING: Use --illegal-access=warn to enable warnings of further illegal reflective access operations
WARNING: All illegal access operations will be denied in a future release
20/11/14 23:43:06 INFO SparkContext: Running Spark version 3.0.1
Exception in thread "process reaper" java.lang.StackOverflowError
at java.base/java.lang.invoke.MethodType$ConcurrentWeakInternSet$WeakEntry.equals(MethodType.java:1341)
at java.base/java.util.concurrent.ConcurrentHashMap.get(ConcurrentHashMap.java:940)
at java.base/java.lang.invoke.MethodType$ConcurrentWeakInternSet.get(MethodType.java:1279)
at java.base/java.lang.invoke.MethodType.makeImpl(MethodType.java:300)
at java.base/java.lang.invoke.MethodTypeForm.canonicalize(MethodTypeForm.java:355)
at java.base/java.lang.invoke.MethodTypeForm.findForm(MethodTypeForm.java:317)
at java.base/java.lang.invoke.MethodType.makeImpl(MethodType.java:315)
at java.base/java.lang.invoke.MethodType.insertParameterTypes(MethodType.java:410)
at java.base/java.lang.invoke.VarHandle$AccessDescriptor.<init>(VarHandle.java:1853)
at java.base/java.lang.invoke.MethodHandleNatives.varHandleOperationLinkerMethod(MethodHandleNatives.java:518)
at java.base/java.lang.invoke.MethodHandleNatives.linkMethodImpl(MethodHandleNatives.java:462)
at java.base/java.lang.invoke.MethodHandleNatives.linkMethod(MethodHandleNatives.java:450)
at java.base/java.util.concurrent.CompletableFuture.completeValue(CompletableFuture.java:305)
at java.base/java.util.concurrent.CompletableFuture.complete(CompletableFuture.java:2072)
at java.base/java.lang.ProcessHandleImpl$1.run(ProcessHandleImpl.java:162)
at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
at java.base/java.lang.Thread.run(Thread.java:834)
The most likely cause is that you're using Spark 3, which we haven't tested against yet. Can you try it with Spark 2.4?
Good point @dfdx. I changed the version to this one: https://downloads.apache.org/spark/spark-2.4.7/spark-2.4.7-bin-hadoop2.7.tgz
Now I'm getting:
julia> using Spark
julia> Spark.init()
ERROR: ArgumentError: invalid index: nothing of type Nothing
Stacktrace:
[1] to_index(::Nothing) at ./indices.jl:297
[2] to_index(::Array{String,1}, ::Nothing) at ./indices.jl:274
[3] to_indices at ./indices.jl:325 [inlined]
[4] to_indices at ./indices.jl:322 [inlined]
[5] getindex at ./abstractarray.jl:980 [inlined]
[6] load_spark_defaults(::Dict{Any,Any}) at /root/.julia/packages/Spark/3MVGw/src/init.jl:55
[7] init() at /root/.julia/packages/Spark/3MVGw/src/init.jl:5
[8] top-level scope at REPL[2]:1
What is your setup?
I'm using Julia 1.5 and all the default settings, which result in Spark 2.4.7.
How do you set the Spark version? Do you use the SPARK_CONF or SPARK_HOME environment variables for this?
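Since the failing frame is `load_spark_defaults`, a plausible culprit is the configuration lookup. A minimal sketch of pointing the environment at a Spark install (the paths are illustrative, and whether an empty `spark-defaults.conf` is sufficient is an assumption, not confirmed behavior of Spark.jl):

```shell
# Hypothetical paths -- point these at your real Spark 2.4.7 install.
export SPARK_HOME="${SPARK_HOME:-$PWD/spark-2.4.7-bin-hadoop2.7}"
export SPARK_CONF="$SPARK_HOME/conf"
mkdir -p "$SPARK_CONF"
# The stack trace shows Spark.init() going through load_spark_defaults,
# which appears to parse spark-defaults.conf; make sure one exists so the
# lookup doesn't come back empty (nothing):
touch "$SPARK_CONF/spark-defaults.conf"
ls "$SPARK_CONF"
```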
I ran into the same problem with the StackOverflowError, and it was solved by switching to Java 8. I think there should be a warning when a newer Java version is used by accident; it was quite hard to find what caused the problem :sweat_smile:
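To catch this earlier, one can check the JVM's major version before calling Spark.init(). A sketch of the version parsing (the `ver_line` string below is a hard-coded sample; in practice you would capture it with `java -version 2>&1 | head -n 1`):

```shell
# Sample output line; Java 8 reports itself as "1.8.x", Java 9+ as "11.x" etc.
ver_line='openjdk version "1.8.0_275"'
# Strip the optional "1." prefix so both schemes yield the major version.
major=$(printf '%s\n' "$ver_line" | sed -E 's/.*"(1\.)?([0-9]+).*/\2/')
echo "major=$major"
# A guard before starting Spark could then be:
#   [ "$major" = "8" ] || echo "warning: Spark 2.x expects Java 8, found $major"
```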
Closing as outdated.