Spark 2.3 is now out... Will Dr. Elephant support it out of the box?

Open zauggerr opened this issue 7 years ago • 5 comments

https://spark.apache.org/releases/spark-release-2-3-0.html

So I've been reading about how LinkedIn has been using a modified Spark History Server (SHS) to pull from Spark 1.x and 2.x...

But if you were to install Spark 2.3, would Dr. Elephant just work with it out of the box?

Has anyone tried Spark 2.3 with Dr. Elephant yet?

zauggerr avatar Mar 08 '18 14:03 zauggerr

I think Spark 2.3 is a requirement to analyze Spark 2.x jobs, as per https://github.com/linkedin/dr-elephant/issues/327, or at least you would need to build a custom SHS, since that was a prerequisite. See https://issues.apache.org/jira/browse/SPARK-18085 for details.

Tagar avatar Mar 12 '18 18:03 Tagar

SPARK-18085 brings LevelDB-backed storage to the Spark History Server (SHS). This helps Dr. Elephant gather metrics because it improves overall SHS performance. If you run an older SHS that keeps all data in memory, Dr. Elephant can still gather metrics from it, since the REST fetcher in Dr. Elephant calls the same REST APIs. If you don't have a large number of applications per day, an older SHS might still work for you.
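To illustrate what "calls the same REST APIs" means in practice, here is a minimal Python sketch of querying the SHS application list endpoint (`/api/v1/applications`, with its `status` and `limit` query parameters). This is not Dr. Elephant's actual fetcher code; the host name, the function names, and the choice of Python are assumptions made for illustration only.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen


def applications_url(shs_base, status=None, limit=None):
    """Build the URL of the SHS application list endpoint.

    shs_base is a hypothetical base address such as
    "http://shs.example.com:18080" (18080 is the default SHS port).
    """
    params = {}
    if status:
        params["status"] = status  # "completed" or "running"
    if limit is not None:
        params["limit"] = limit
    url = shs_base.rstrip("/") + "/api/v1/applications"
    return url + ("?" + urlencode(params) if params else "")


def parse_application_ids(body):
    """Extract application IDs from the JSON array returned by the endpoint."""
    return [app["id"] for app in json.loads(body)]


def fetch_application_ids(shs_base, status="completed", limit=10):
    """Fetch application IDs over HTTP (needs a reachable SHS)."""
    with urlopen(applications_url(shs_base, status, limit)) as resp:
        return parse_application_ids(resp.read().decode("utf-8"))
```

Because the endpoint path is the same whether the SHS keeps its data in memory or in LevelDB, a client like this does not need to know which storage backend is in use.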

However, we added new metrics to Spark (code changes in the executor, driver, and SHS, as described in SPARK-23206), and you might not be able to get some of them. You need a custom SHS because our PR for these new metrics has not been merged yet. We are targeting Spark 2.3.1 and Spark 2.4. If your SHS does not include these changes, Dr. Elephant will not be able to gather some of the new metrics.
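A client that reads executor data from the SHS can tolerate the missing metrics by treating them as optional. The sketch below assumes the new metrics appear as a `peakMemoryMetrics` object (with keys such as `JVMHeapMemory`) on each entry of the executors endpoint, which is how later Spark releases expose them; those field names are not stated in this thread, and the function name is hypothetical.

```python
def peak_jvm_heap_by_executor(executors):
    """Map executor ID to peak JVM heap usage, or None when unavailable.

    `executors` is the parsed JSON list from an executors endpoint.
    The "peakMemoryMetrics" and "JVMHeapMemory" names are assumptions;
    an SHS without the new metrics simply omits them.
    """
    result = {}
    for executor in executors:
        metrics = executor.get("peakMemoryMetrics")
        result[executor["id"]] = metrics.get("JVMHeapMemory") if metrics else None
    return result


# Sample data: one executor from a patched SHS, one from an unpatched SHS.
sample = [
    {"id": "1", "peakMemoryMetrics": {"JVMHeapMemory": 123456}},
    {"id": "2"},
]
print(peak_jvm_heap_by_executor(sample))
```

Returning `None` rather than raising lets the rest of an analysis proceed with whatever metrics the SHS does provide.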

Internally, we run the Spark 2.3 History Server with the above PR applied. We also carry some other patches for SHS issues, but those are not blockers for using Dr. Elephant.

zhouyejoe avatar Mar 12 '18 19:03 zhouyejoe

Thank you @zhouyejoe for this information. It's really helpful, as we will be checking out Dr. Elephant soon too.

Tagar avatar Mar 12 '18 19:03 Tagar

@zhouyejoe does your team/LinkedIn have any plans to open source the Spark 2.3/2.4 patches you mentioned above?

helenfeng737 avatar Mar 19 '19 04:03 helenfeng737

> @zhouyejoe does your team/LinkedIn have any plans to open source the Spark 2.3/2.4 patches you mentioned above?

Those PRs have been merged into Spark trunk. Please take a look at the comments in https://issues.apache.org/jira/browse/SPARK-23206 for more details.

zhouyejoe avatar Mar 27 '19 01:03 zhouyejoe