Fix: Corrects $SPARK_HOME errors in spark container

Open brockneedscoffee opened this issue 4 years ago • 11 comments

When Spark is installed via Miniconda, SPARK_HOME is not set, so you cannot run Spark in AML. You get errors when you try to set up the Spark context, or when you set the framework to PySpark in an AML pipeline step. This PR installs Spark, sets SPARK_HOME, and installs the Open MPI dependency.
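To illustrate the failure mode, here is a minimal sketch (not taken from the PR itself) of how one might check for a usable Spark home before starting a Spark context. The helper name `resolve_spark_home` is hypothetical, and the fallback to the installed `pyspark` package directory is an assumption about how a conda or pip install lays out Spark; the container fix in this PR sets the variable in the image instead.

```python
import importlib.util
import os


def resolve_spark_home(env=None):
    """Return a candidate Spark home directory, or None if none is found.

    Hypothetical helper: prefers an explicit SPARK_HOME, then falls back
    to the directory of an installed pyspark package (an assumption about
    conda/pip layouts, not something this PR guarantees).
    """
    env = os.environ if env is None else env

    # An explicitly configured SPARK_HOME always wins.
    spark_home = env.get("SPARK_HOME")
    if spark_home:
        return spark_home

    # Otherwise look for an importable pyspark package without importing it.
    spec = importlib.util.find_spec("pyspark")
    if spec is None or spec.origin is None:
        return None
    return os.path.dirname(spec.origin)


if __name__ == "__main__":
    home = resolve_spark_home()
    if home is None:
        print("SPARK_HOME is not set and pyspark was not found")
    else:
        print(f"Spark home candidate: {home}")
```

A check like this only reports the problem earlier; it does not replace setting SPARK_HOME in the container image.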

brockneedscoffee avatar Mar 16 '21 19:03 brockneedscoffee

💖 Thanks for opening your first pull request! 💖 We use semantic commit messages to streamline the release process. Before your pull request can be merged, you should make sure your first commit and PR title start with a semantic prefix. This helps us to create release messages and credit you for your hard work! Examples of commit messages with semantic prefixes:

  • fix: Fix LightGBM crashes with empty partitions
  • feat: Make HTTP on Spark back-offs configurable
  • docs: Update Spark Serving usage
  • build: Add codecov support
  • perf: improve LightGBM memory usage
  • refactor: make python code generation rely on classes
  • style: Remove nulls from CNTKModel
  • test: Add test coverage for CNTKModel

Make sure to check out the developer guide for guidance on testing your change.

welcome[bot] avatar Mar 16 '21 19:03 welcome[bot]

/azp run

mhamilton723 avatar Mar 16 '21 19:03 mhamilton723

Azure Pipelines successfully started running 1 pipeline(s).

azure-pipelines[bot] avatar Mar 16 '21 19:03 azure-pipelines[bot]

Codecov Report

Merging #1009 (a17834c) into master (2c223f6) will increase coverage by 0.35%. The diff coverage is n/a.

:exclamation: Current head a17834c differs from pull request most recent head 0040785. Consider uploading reports for the commit 0040785 to get more accurate results.

@@            Coverage Diff             @@
##           master    #1009      +/-   ##
==========================================
+ Coverage   84.13%   84.49%   +0.35%     
==========================================
  Files         201      199       -2     
  Lines        9306     9172     -134     
  Branches      554      543      -11     
==========================================
- Hits         7830     7750      -80     
+ Misses       1476     1422      -54     
Impacted Files Coverage Δ
...om/microsoft/ml/spark/train/AutoTrainedModel.scala 50.00% <0.00%> (-35.72%) :arrow_down:
.../org/apache/spark/ml/param/UntypedArrayParam.scala 37.50% <0.00%> (-20.40%) :arrow_down:
...n/scala/com/microsoft/ml/spark/stages/Lambda.scala 80.00% <0.00%> (-13.34%) :arrow_down:
...a/com/microsoft/ml/spark/io/http/HTTPClients.scala 76.66% <0.00%> (-6.67%) :arrow_down:
...icrosoft/ml/spark/featurize/CleanMissingData.scala 88.88% <0.00%> (-4.96%) :arrow_down:
.../com/microsoft/ml/spark/stages/ClassBalancer.scala 81.81% <0.00%> (-4.85%) :arrow_down:
...icrosoft/ml/spark/automl/TuneHyperparameters.scala 73.91% <0.00%> (-4.35%) :arrow_down:
.../microsoft/ml/spark/vw/VowpalWabbitRegressor.scala 70.00% <0.00%> (-2.73%) :arrow_down:
.../com/microsoft/ml/spark/automl/FindBestModel.scala 85.00% <0.00%> (-1.54%) :arrow_down:
.../microsoft/ml/spark/core/utils/ModelEquality.scala 85.71% <0.00%> (-1.25%) :arrow_down:
... and 71 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 2c223f6...0040785. Read the comment docs.

codecov[bot] avatar Mar 16 '21 19:03 codecov[bot]

/azp run

brockneedscoffee avatar Mar 16 '21 20:03 brockneedscoffee

Azure Pipelines successfully started running 1 pipeline(s).

azure-pipelines[bot] avatar Mar 16 '21 20:03 azure-pipelines[bot]

/azp run

brockneedscoffee avatar Mar 25 '21 04:03 brockneedscoffee

Azure Pipelines successfully started running 1 pipeline(s).

azure-pipelines[bot] avatar Mar 25 '21 04:03 azure-pipelines[bot]

/azp run

brockneedscoffee avatar Mar 25 '21 04:03 brockneedscoffee

Azure Pipelines successfully started running 1 pipeline(s).

azure-pipelines[bot] avatar Mar 25 '21 04:03 azure-pipelines[bot]

@brockneedscoffee - looks good, but we don't test the Docker notebook as part of the build yet, so could you manually test these changes?

Also, could you group all the RUN steps into a single RUN step and put the ENV settings at the top of the Dockerfile? Putting all of the RUN steps together (and deleting any .tgz files you download in those steps) makes the Docker image considerably smaller. Thanks!

mhamilton723 avatar Mar 28 '21 17:03 mhamilton723