
[WIP] Enable customized and isolated Python environment for PySpark

Open · saLeox opened this pull request 3 years ago · 2 comments

What is the purpose of the change

Allows users to specify an isolated Python environment for PySpark, for cases where special or newer Python packages such as PyArrow or pandas are needed.

Related issues/PRs

Related issues: #3396

Brief change log

  • Allow users to specify the Python environment from the global settings via the UI;

  • The three parameters (example below) take effect when PySpark initializes its gateway (initGateway); a sketch of how they could be applied follows the example.

spark.yarn.dist.archives=hdfs://acluster/user/spark/python-env/py.tar.gz#environment
spark.pyspark.python=./environment/bin/python
spark.pyspark.driver.python=/usr/local/python-env/py/bin/python
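
For illustration, here is a minimal sketch, in Scala, of how these three properties could be forwarded onto the SparkConf before the Py4J gateway is started. This is not the actual Linkis implementation: the PythonEnvConf helper and the userSettings map are assumptions made for the example.

import org.apache.spark.SparkConf

// Hypothetical helper: copies the user-provided Python environment settings
// onto the SparkConf used to launch the PySpark (Py4J) gateway.
object PythonEnvConf {
  private val PythonEnvKeys = Seq(
    "spark.yarn.dist.archives",    // archive containing the packed Python environment
    "spark.pyspark.python",        // interpreter path inside the unpacked archive (executors)
    "spark.pyspark.driver.python"  // interpreter path on the driver host
  )

  def apply(conf: SparkConf, userSettings: Map[String, String]): SparkConf =
    PythonEnvKeys.foldLeft(conf) { (c, key) =>
      userSettings.get(key).map(value => c.set(key, value)).getOrElse(c)
    }
}

With the example values above, Spark ships the py.tar.gz archive to the YARN containers and unpacks it under the ./environment alias (the fragment after #), so ./environment/bin/python resolves inside each executor, while the driver uses the locally installed interpreter at /usr/local/python-env/py/bin/python.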

Checklist

  • [x] I have read the Contributing Guidelines on pull requests.
  • [x] I have explained the need for this PR and the problem it solves
  • [x] I have explained the changes or the new features added to this PR
  • [ ] I have added tests corresponding to this change
  • [ ] I have updated the documentation to reflect this change
  • [ ] I have verified that this change is backward compatible (If not, please discuss on the Linkis mailing list first)
  • [ ] If this is a code change: I have written unit tests to fully verify the new behavior.

saLeox · Sep 23 '22 09:09

Codecov Report

Merging #3525 (94851e5) into dev-1.3.1 (4874820) will increase coverage by 0.08%. The diff coverage is 0.00%.

:exclamation: Current head 94851e5 differs from the pull request's most recent head d2df0e5. Consider uploading reports for the commit d2df0e5 to get more accurate results.

@@               Coverage Diff               @@
##             dev-1.3.1    #3525      +/-   ##
===============================================
+ Coverage        14.14%   14.22%   +0.08%     
- Complexity        1512     1523      +11     
===============================================
  Files             1046     1036      -10     
  Lines            38647    38369     -278     
  Branches          5426     5427       +1     
===============================================
- Hits              5465     5459       -6     
+ Misses           32422    32149     -273     
- Partials           760      761       +1     
| Impacted Files | Coverage Δ |
|---|---|
| ...engineplugin/spark/config/SparkConfiguration.scala | 93.54% <ø> (-0.21%) :arrow_down: |
| ...ineplugin/spark/executor/SparkPythonExecutor.scala | 0.00% <0.00%> (ø) |
| ...a/org/apache/linkis/scheduler/queue/Consumer.scala | 85.71% <0.00%> (-14.29%) :arrow_down: |
| ...orcode/LinkisGwAuthenticationErrorCodeSummary.java | 63.63% <0.00%> (-8.59%) :arrow_down: |
| ...data/errorcode/LinkisMetadataErrorCodeSummary.java | 59.09% <0.00%> (-7.58%) :arrow_down: |
| ...s/cs/errorcode/LinkisCsServerErrorCodeSummary.java | 58.82% <0.00%> (-2.72%) :arrow_down: |
| ...s/scheduler/queue/fifoqueue/FIFOUserConsumer.scala | 36.79% <0.00%> (-1.89%) :arrow_down: |
| ...ugin/jdbc/executor/JDBCMultiDatasourceParser.scala | 45.86% <0.00%> (-0.71%) :arrow_down: |
| ...main/scala/org/apache/linkis/rpc/RPCMapCache.scala | 0.00% <0.00%> (ø) |
| .../org/apache/linkis/rpc/transform/RPCConsumer.scala | 0.00% <0.00%> (ø) |

... and 91 more


codecov[bot] · Sep 23 '22 10:09

Run 'mvn spotless:apply' to fix these violations.

casionone · Oct 16 '22 06:10