flambo
flambo copied to clipboard
Getting error on spark job submission
Hello,
I have written a code like -
(def c (-> (conf/spark-conf)
(conf/master "spark://abhi:7077")
(conf/app-name "test")))
(def sc (f/spark-context c))
(let [rdd (f/parallelize sc [[1 2][3 4]])]
(f/collect rdd))
When I am collecting data from rdd, spark job submission happens and I get an error-
java.lang.IllegalStateException: unread block data at java.io.ObjectInputStream$BlockDataInputStream.setBlockDataMode(ObjectInputStream.java:2449) at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1385) at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:2018) at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1942) at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1808) at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1353) at java.io.ObjectInputStream.readObject(ObjectInputStream.java:373) at org.apache.spark.serializer.JavaDeserializationStream.readObject(JavaSerializer.scala:76) at org.apache.spark.serializer.JavaSerializerInstance.deserialize(JavaSerializer.scala:115) at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:194) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) at java.lang.Thread.run(Thread.java:745)
This very same code runs perfectly fine when I am setting master as local[*]
. Please help me with this issue, what can be the problem is.
I am using spark 1.6.0
and scala version 2.10
. And I am running a standalone spark with the same version 1.6.0
.
what version of flambo are you using?
Per the README, Spark 1.x requires [yieldbot/flambo "0.7.2"]
yep, I am using "0.7.2" only.
@sorenmacbeth , Do you have any idea about this issue? Issue is like:
org.apache.spark.SparkException: Failed to register classes with Kryo at org.apache.spark.serializer.KryoSerializer.newKryo(KryoSerializer.scala:139) at org.apache.spark.serializer.KryoSerializerInstance.borrowKryo(KryoSerializer.scala:292) at org.apache.spark.serializer.KryoSerializerInstance.
(KryoSerializer.scala:277) at org.apache.spark.serializer.KryoSerializer.newInstance(KryoSerializer.scala:186) at org.apache.spark.rdd.ParallelCollectionPartition$$anonfun$readObject$1.apply$mcV$sp(ParallelCollectionRDD.scala:79) at org.apache.spark.rdd.ParallelCollectionPartition$$anonfun$readObject$1.apply(ParallelCollectionRDD.scala:70) at org.apache.spark.rdd.ParallelCollectionPartition$$anonfun$readObject$1.apply(ParallelCollectionRDD.scala:70) at org.apache.spark.util.Utils$.tryOrIOException(Utils.scala:1303) at org.apache.spark.rdd.ParallelCollectionPartition.readObject(ParallelCollectionRDD.scala:70) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:498) at java.io.ObjectStreamClass.invokeReadObject(ObjectStreamClass.java:1158) at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:2169) at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:2060) at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1567) at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:2278) at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:2202) at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:2060) at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1567) at java.io.ObjectInputStream.readObject(ObjectInputStream.java:427) at org.apache.spark.serializer.JavaDeserializationStream.readObject(JavaSerializer.scala:75) at org.apache.spark.serializer.JavaSerializerInstance.deserialize(JavaSerializer.scala:114) at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:309) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) at java.lang.Thread.run(Thread.java:748) Caused by: java.lang.ClassNotFoundException: flambo.kryo.BaseFlamboRegistrator at java.net.URLClassLoader.findClass(URLClassLoader.java:381) at java.lang.ClassLoader.loadClass(ClassLoader.java:424) at java.lang.ClassLoader.loadClass(ClassLoader.java:357) at java.lang.Class.forName0(Native Method) at java.lang.Class.forName(Class.java:348) at org.apache.spark.serializer.KryoSerializer$$anonfun$newKryo$5.apply(KryoSerializer.scala:134) at org.apache.spark.serializer.KryoSerializer$$anonfun$newKryo$5.apply(KryoSerializer.scala:134) at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:234) at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:234) at scala.collection.IndexedSeqOptimized$class.foreach(IndexedSeqOptimized.scala:33) at scala.collection.mutable.ArrayOps$ofRef.foreach(ArrayOps.scala:186) at scala.collection.TraversableLike$class.map(TraversableLike.scala:234) at scala.collection.mutable.ArrayOps$ofRef.map(ArrayOps.scala:186) at org.apache.spark.serializer.KryoSerializer.newKryo(KryoSerializer.scala:134) ... 27 more 18/01/30 12:18:11 ERROR Executor: Exception in task 0.3 in stage 0.0 (TID 14) java.lang.IllegalStateException: unread block data at java.io.ObjectInputStream$BlockDataInputStream.setBlockDataMode(ObjectInputStream.java:2773) at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1599) at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:2278) at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:2202) at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:2060) at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1567) at java.io.ObjectInputStream.readObject(ObjectInputStream.java:427) at org.apache.spark.serializer.JavaDeserializationStream.readObject(JavaSerializer.scala:75) at org.apache.spark.serializer.JavaSerializerInstance.deserialize(JavaSerializer.scala:114) at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:309) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) at java.lang.Thread.run(Thread.java:748)
I am getting same issue with flambo 0.8.2 and spark 2.2.0.
looks like you/re trying to serialize a class with kryo that you haven't registered. this is not really a flambo issue. take a look at the spark documentation for kryo serialization.
On Mon, Jan 29, 2018 at 11:09 PM, Abhishek B Jangid < [email protected]> wrote:
@sorenmacbeth https://github.com/sorenmacbeth , Do you have any idea about this issue? Issue is like:
org.apache.spark.SparkException: Failed to register classes with Kryo at org.apache.spark.serializer.KryoSerializer.newKryo( KryoSerializer.scala:139) at org.apache.spark.serializer.KryoSerializerInstance. borrowKryo(KryoSerializer.scala:292) at org.apache.spark.serializer.KryoSerializerInstance.( KryoSerializer.scala:277) at org.apache.spark.serializer.KryoSerializer.newInstance( KryoSerializer.scala:186) at org.apache.spark.rdd.ParallelCollectionPartition$$ anonfun$readObject$1.apply$mcV$sp(ParallelCollectionRDD.scala:79) at org.apache.spark.rdd.ParallelCollectionPartition$$ anonfun$readObject$1.apply(ParallelCollectionRDD.scala:70) at org.apache.spark.rdd.ParallelCollectionPartition$$ anonfun$readObject$1.apply(ParallelCollectionRDD.scala:70) at org.apache.spark.util.Utils$.tryOrIOException(Utils.scala:1303) at org.apache.spark.rdd.ParallelCollectionPartition.readObject( ParallelCollectionRDD.scala:70) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke( NativeMethodAccessorImpl.java:62) at sun.reflect.DelegatingMethodAccessorImpl.invoke( DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:498) at java.io.ObjectStreamClass.invokeReadObject(ObjectStreamClass.java:1158) at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:2169) at java.io.ObjectInputStream.readOrdinaryObject( ObjectInputStream.java:2060) at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1567) at java.io.ObjectInputStream.defaultReadFields( ObjectInputStream.java:2278) at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:2202) at java.io.ObjectInputStream.readOrdinaryObject( ObjectInputStream.java:2060) at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1567) at java.io.ObjectInputStream.readObject(ObjectInputStream.java:427) at org.apache.spark.serializer.JavaDeserializationStream. readObject(JavaSerializer.scala:75) at org.apache.spark.serializer.JavaSerializerInstance. deserialize(JavaSerializer.scala:114) at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:309) at java.util.concurrent.ThreadPoolExecutor.runWorker( ThreadPoolExecutor.java:1149) at java.util.concurrent.ThreadPoolExecutor$Worker.run( ThreadPoolExecutor.java:624) at java.lang.Thread.run(Thread.java:748) Caused by: java.lang.ClassNotFoundException: flambo.kryo. BaseFlamboRegistrator at java.net.URLClassLoader.findClass(URLClassLoader.java:381) at java.lang.ClassLoader.loadClass(ClassLoader.java:424) at java.lang.ClassLoader.loadClass(ClassLoader.java:357) at java.lang.Class.forName0(Native Method) at java.lang.Class.forName(Class.java:348) at org.apache.spark.serializer.KryoSerializer$$anonfun$newKryo$5.apply( KryoSerializer.scala:134) at org.apache.spark.serializer.KryoSerializer$$anonfun$newKryo$5.apply( KryoSerializer.scala:134) at scala.collection.TraversableLike$$anonfun$map$ 1.apply(TraversableLike.scala:234) at scala.collection.TraversableLike$$anonfun$map$ 1.apply(TraversableLike.scala:234) at scala.collection.IndexedSeqOptimized$class.foreach(IndexedSeqOptimized. scala:33) at scala.collection.mutable.ArrayOps$ofRef.foreach(ArrayOps.scala:186) at scala.collection.TraversableLike$class.map(TraversableLike.scala:234) at scala.collection.mutable.ArrayOps$ofRef.map(ArrayOps.scala:186) at org.apache.spark.serializer.KryoSerializer.newKryo( KryoSerializer.scala:134) ... 27 more 18/01/30 12:18:11 ERROR Executor: Exception in task 0.3 in stage 0.0 (TID 14) java.lang.IllegalStateException: unread block data at java.io.ObjectInputStream$BlockDataInputStream.setBlockDataMode( ObjectInputStream.java:2773) at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1599) at java.io.ObjectInputStream.defaultReadFields( ObjectInputStream.java:2278) at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:2202) at java.io.ObjectInputStream.readOrdinaryObject( ObjectInputStream.java:2060) at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1567) at java.io.ObjectInputStream.readObject(ObjectInputStream.java:427) at org.apache.spark.serializer.JavaDeserializationStream. readObject(JavaSerializer.scala:75) at org.apache.spark.serializer.JavaSerializerInstance. deserialize(JavaSerializer.scala:114) at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:309) at java.util.concurrent.ThreadPoolExecutor.runWorker( ThreadPoolExecutor.java:1149) at java.util.concurrent.ThreadPoolExecutor$Worker.run( ThreadPoolExecutor.java:624) at java.lang.Thread.run(Thread.java:748)
I am getting same issue with flambo 0.8.2 and spark 2.2.0.
— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub https://github.com/yieldbot/flambo/issues/132#issuecomment-361496998, or mute the thread https://github.com/notifications/unsubscribe-auth/AAH7-9cpaA-VBSEuC3OSz-cCv1cQKHztks5tPsAbgaJpZM4RkO1Y .
@sorenmacbeth So, Do I have to register all the classes used in the code. Such as Byte array
, Boolean array
etc. Actually I have written many applications using flambo-0.8.2 and spark-2x, but never encountered with this error. So I suppose some of the classes are already registered.
As well as, how do I register the classes,
I have tried for some classes using https://github.com/EsotericSoftware/kryo
, but not being able to register them.