dr-elephant
Spark 2.X applications could not be analyzed?
I built the project with compile.conf set to hadoop_version=2.6.0 and spark_version=1.6.0.
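For reference, this is roughly what that build setup looks like; the file contents match the versions stated above, and the `./compile.sh` invocation is the usual Dr. Elephant build entry point (treat the exact script path as an assumption for your checkout):

```shell
# compile.conf -- version pins used for this (working) Spark 1.X build
hadoop_version=2.6.0
spark_version=1.6.0
```

```shell
# build, run from the root of the dr-elephant checkout
./compile.sh compile.conf
```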
I run my Spark app on Spark 2.X. There are no error messages in dr_elephant.log, but in my Dr. Elephant web UI nothing shows up for the application. So, do I need to compile the project with Hadoop 2.X and Spark 2.X? That runs into other build problems. What should I do?
I changed my compile.conf to specify spark_version=2.1.1, but the build fails. Does Dr. Elephant not support Spark 2.X?
Dr. Elephant does not support Spark 2.X.
@shkhrgpt do you mean the fetcher is not able to parse it or Dr. E itself would not build with 2.X?
Dr. Elephant will not build with Spark 2.X after SparkFSFetcher was added, because of the Spark listener classes. The workaround would be to build with Spark 1.X and use it against Spark 2.X, but that only works as long as the Spark REST fetcher is used.
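The REST-fetcher workaround corresponds to selecting the REST-based Spark fetcher in app-conf/FetcherConf.xml instead of the filesystem-based SparkFSFetcher. A minimal sketch, assuming the `SparkFetcher` class name from the dr-elephant tree (verify it against your version):

```xml
<fetchers>
  <!-- REST-based Spark fetcher: reads application data via the Spark
       History Server REST API, so a Dr. Elephant binary built against
       Spark 1.X can still fetch Spark 2.X applications. -->
  <fetcher>
    <applicationtype>spark</applicationtype>
    <classname>com.linkedin.drelephant.spark.fetchers.SparkFetcher</classname>
  </fetcher>
</fetchers>
```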
I see, thanks for clarifying this.
@akshayrai why don't we create another branch to support Spark 2.X in Dr. Elephant?
@BruceXu1991 I don't think we need a separate branch for Spark 2.X. Even in the same branch, we should add Spark 2.X support without breaking Spark 1.X.
Agreed; we should add Spark 2.X support rather than maintain a separate branch for it.
Is anyone working on this?
The folks at Pepperdata said they would contribute this back, but there is no sign of it yet.
I am working on modifying it to support Spark 2.0.
I tried the Spark REST fetcher with Spark 2.2; I get the following exception, and the metrics all have zero values:
[error] o.a.s.s.ReplayListenerBus - Exception parsing Spark event log: application_1510469066221_0020
org.json4s.package$MappingException: Did not find value which can be converted into boolean
    at org.json4s.reflect.package$.fail(package.scala:96) ~[org.json4s.json4s-core_2.10-3.2.10.jar:3.2.10]
    at org.json4s.Extraction$.convert(Extraction.scala:554) ~[org.json4s.json4s-core_2.10-3.2.10.jar:3.2.10]
    at org.json4s.Extraction$.extract(Extraction.scala:331) ~[org.json4s.json4s-core_2.10-3.2.10.jar:3.2.10]
    at org.json4s.Extraction$.extract(Extraction.scala:42) ~[org.json4s.json4s-core_2.10-3.2.10.jar:3.2.10]
    at org.json4s.ExtractableJsonAstNode.extract(ExtractableJsonAstNode.scala:21) ~[org.json4s.json4s-core_2.10-3.2.10.jar:3.2.10]
    at org.apache.spark.util.JsonProtocol$.storageLevelFromJson(JsonProtocol.scala:881) ~[org.apache.spark.spark-core_2.10-1.6.3.jar:1.6.3]
[error] o.a.s.s.ReplayListenerBus - Malformed line #37: {"Event":"SparkListenerJobStart","Job ID":0,"Submission Time":1511486140513,"Stage Infos":[{"Stage ID":0,"Stage Attempt ID":0,"Stage Name":"collect at /home/hadoop/classify_image_pyspark_emr.py:185","Number of Tasks":512,"RDD Info":[{"RDD ID":1,"Name":"PythonRDD","Callsite":"collect at /home/hadoop/classify_image_pyspark_emr.py:185","Parent IDs":[0],"Storage Level":{"Use Disk":false,"Use Memory":false,"Deserialized":false,"Replication":1},"Number of Partitions":512,"Number of Cached Partitions":0,"Memory Size":0,"Disk Size":0},{"RDD ID":0,"Name":"ParallelCollectionRDD","Scope":"{"id":"0","name":"parallelize"}","Callsite":"parallelize at PythonRDD.scala:480","Parent IDs":[],"Storage Level":{"Use Disk":false,"Use Memory":false,"Deserialized":false,"Replication":1},"Number of Partitions":512,"Number of Cached Partitions":0,"Memory Size":0,"Disk Size":0}],"Parent IDs":[],"Details":"org.apache.spark.rdd.RDD.collect(RDD.scala:935)\norg.apache.spark.api.python.PythonRDD$.collectAndServe(PythonRDD.scala:458)\norg.apache.spark.api.python.PythonRDD.collectAndServe(PythonRDD.scala)\nsun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)\nsun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)\nsun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)\njava.lang.reflect.Method.invoke(Method.java:498)\npy4j.reflection.MethodInvoker.invoke(MethodInvoker.java:244)\npy4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:357)\npy4j.Gateway.invoke(Gateway.java:280)\npy4j.commands.AbstractCommand.invokeMethod(AbstractCommand.java:132)\npy4j.commands.CallCommand.execute(CallCommand.java:79)\npy4j.GatewayConnection.run(GatewayConnection.java:214)\njava.lang.Thread.run(Thread.java:748)","Accumulables":[]}],"Stage IDs":[0],"Properties":{"spark.rdd.scope.noOverride":"true","callSite.short":"collect at /home/hadoop/classify_image_pyspark_emr.py:185","spark.rdd.scope":"{"id":"1","name":"collect"}"}}
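Note the jar names in the trace: the replay is done by the Spark 1.6.3 `JsonProtocol`, while the log was written by Spark 2.x. The `storageLevelFromJson` failure is most likely a schema mismatch in the "Storage Level" object: the 1.6 reader expects an extra boolean field (`"Use ExternalBlockStore"`) that Spark 2.x logs no longer emit, hence "Did not find value which can be converted into boolean". A small sketch of that mismatch (the field list is taken from the Spark 1.6-era event-log format; verify against your Spark sources):

```python
import json

# "Storage Level" entry exactly as it appears in the Spark 2.x
# event log from the error above.
spark2_storage_level = json.loads(
    '{"Use Disk": false, "Use Memory": false,'
    ' "Deserialized": false, "Replication": 1}'
)

# Fields the Spark 1.6 JsonProtocol.storageLevelFromJson reader expects;
# the extra boolean is absent from Spark 2.x logs, which is the likely
# trigger of the MappingException.
SPARK_16_FIELDS = ["Use Disk", "Use Memory", "Use ExternalBlockStore",
                   "Deserialized", "Replication"]

missing = [f for f in SPARK_16_FIELDS if f not in spark2_storage_level]
print(missing)  # the field(s) a 1.6-built Dr. Elephant cannot find
```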
Is anybody here analyzing Spark 2.X jobs in Dr. Elephant? How is the Spark 2.X SparkListener blocking event reading and analysis here?
Please refer to #327 for the updates.
Let me look at your app-conf/FetcherConf.xml file.
I am also facing the same issue: I compiled Dr. Elephant with Spark 1.X and am analyzing Spark 2.X jobs.
I have solved this problem and can push it to git.
I solved this problem. My fork is at https://github.com/Hanqingkuo/dr.elephant-spark2.x
It would be nice if you opened a pull request here. @Hanqingkuo
@Hanqingkuo Thanks for sharing the code. I could compile Dr. Elephant and am now able to see jobs in the UI. However, the "Malformed line" error is still in the logs, and the heuristics do not seem to be captured correctly. Running on Spark 2.3.
Have you also faced such an issue? (Screen capture)
Probably there is no algorithm collecting those metrics; you need to write it yourself.