SynapseML
Fix: Corrects $SPARK_HOME errors in spark container
When installing Spark via Miniconda, $SPARK_HOME is not set, so you cannot run Spark in AML. You will get errors when you try to set the Spark context, or when you set the framework to PySpark in an AML pipeline step. This PR installs Spark, sets $SPARK_HOME, and installs the Open MPI dependency.
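For readers new to the issue, here is a minimal sketch of the kind of Dockerfile change described above. It is illustrative only: the base image, Spark version, package names, and paths are assumptions, not the PR's actual diff.

```dockerfile
# Illustrative sketch only -- base image, versions, and paths are placeholders.
FROM ubuntu:18.04

# Install wget plus the Open MPI dependency mentioned in the description
# (exact package names may differ per base image).
RUN apt-get update && apt-get install -y --no-install-recommends \
        wget openmpi-bin libopenmpi-dev

# Install a full Spark distribution instead of relying on the conda package.
RUN wget -q https://archive.apache.org/dist/spark/spark-2.4.5/spark-2.4.5-bin-hadoop2.7.tgz \
    && tar -xzf spark-2.4.5-bin-hadoop2.7.tgz -C /opt \
    && rm spark-2.4.5-bin-hadoop2.7.tgz

# Without $SPARK_HOME, AML pipeline steps that set the framework to PySpark
# cannot locate Spark and fail when creating the Spark context.
ENV SPARK_HOME=/opt/spark-2.4.5-bin-hadoop2.7
ENV PATH=$SPARK_HOME/bin:$PATH
```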
Thanks for opening your first pull request! We use semantic commit messages to streamline the release process. Before your pull request can be merged, you should make sure your first commit and PR title start with a semantic prefix. This helps us to create release messages and credit you for your hard work! Examples of commit messages with semantic prefixes:
- fix: Fix LightGBM crashes with empty partitions
- feat: Make HTTP on Spark back-offs configurable
- docs: Update Spark Serving usage
- build: Add codecov support
- perf: improve LightGBM memory usage
- refactor: make python code generation rely on classes
- style: Remove nulls from CNTKModel
- test: Add test coverage for CNTKModel
Make sure to check out the developer guide for guidance on testing your change.
/azp run
Azure Pipelines successfully started running 1 pipeline(s).
Codecov Report
Merging #1009 (a17834c) into master (2c223f6) will increase coverage by 0.35%. The diff coverage is n/a.
:exclamation: Current head a17834c differs from pull request most recent head 0040785. Consider uploading reports for the commit 0040785 to get more accurate results
```diff
@@            Coverage Diff             @@
##           master    #1009      +/-   ##
==========================================
+ Coverage   84.13%   84.49%   +0.35%
==========================================
  Files         201      199       -2
  Lines        9306     9172     -134
  Branches      554      543      -11
==========================================
- Hits         7830     7750      -80
+ Misses       1476     1422      -54
```
| Impacted Files | Coverage Δ | |
|---|---|---|
| ...om/microsoft/ml/spark/train/AutoTrainedModel.scala | 50.00% <0.00%> (-35.72%) | :arrow_down: |
| .../org/apache/spark/ml/param/UntypedArrayParam.scala | 37.50% <0.00%> (-20.40%) | :arrow_down: |
| ...n/scala/com/microsoft/ml/spark/stages/Lambda.scala | 80.00% <0.00%> (-13.34%) | :arrow_down: |
| ...a/com/microsoft/ml/spark/io/http/HTTPClients.scala | 76.66% <0.00%> (-6.67%) | :arrow_down: |
| ...icrosoft/ml/spark/featurize/CleanMissingData.scala | 88.88% <0.00%> (-4.96%) | :arrow_down: |
| .../com/microsoft/ml/spark/stages/ClassBalancer.scala | 81.81% <0.00%> (-4.85%) | :arrow_down: |
| ...icrosoft/ml/spark/automl/TuneHyperparameters.scala | 73.91% <0.00%> (-4.35%) | :arrow_down: |
| .../microsoft/ml/spark/vw/VowpalWabbitRegressor.scala | 70.00% <0.00%> (-2.73%) | :arrow_down: |
| .../com/microsoft/ml/spark/automl/FindBestModel.scala | 85.00% <0.00%> (-1.54%) | :arrow_down: |
| .../microsoft/ml/spark/core/utils/ModelEquality.scala | 85.71% <0.00%> (-1.25%) | :arrow_down: |
| ... and 71 more | | |
Continue to review full report at Codecov.
Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 2c223f6...0040785. Read the comment docs.
/azp run
Azure Pipelines successfully started running 1 pipeline(s).
/azp run
Azure Pipelines successfully started running 1 pipeline(s).
/azp run
Azure Pipelines successfully started running 1 pipeline(s).
@brockneedscoffee - looks good, but we don't test the Docker notebook as part of the build yet, so could you manually test these changes?
Also, could you group all of the RUN steps together into a single RUN step and put the ENV statements at the top of the Dockerfile? Putting all of the RUN steps together (and deleting any .tgz archives you download in those steps) makes the Docker image considerably smaller. Thanks!
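For reference, a hedged sketch of the layering pattern being asked for (again with placeholder base image, versions, and paths): ENV declarations near the top, and one consolidated RUN layer that deletes the downloaded archive in the same layer that created it, so the .tgz never ends up in the final image.

```dockerfile
# Illustrative sketch only -- not the PR's actual Dockerfile.
FROM ubuntu:18.04

# ENV near the top of the file, as requested.
ENV SPARK_HOME=/opt/spark
ENV PATH=$SPARK_HOME/bin:$PATH

# One RUN layer: files removed here (the .tgz, the apt cache) never
# appear in any image layer, which keeps the image considerably smaller.
RUN apt-get update \
    && apt-get install -y --no-install-recommends wget openmpi-bin libopenmpi-dev \
    && wget -q https://archive.apache.org/dist/spark/spark-2.4.5/spark-2.4.5-bin-hadoop2.7.tgz \
    && tar -xzf spark-2.4.5-bin-hadoop2.7.tgz \
    && mv spark-2.4.5-bin-hadoop2.7 /opt/spark \
    && rm spark-2.4.5-bin-hadoop2.7.tgz \
    && rm -rf /var/lib/apt/lists/*
```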