linkis
[WIP] Enable customized and isolated Python environment for PySpark
What is the purpose of the change
Allows users to specify an isolated Python environment for PySpark, for cases that need special Python packages or newer package versions, such as pyArrow or pandas.
Related issues/PRs
Related issues: #3396
Brief change log
- Allow users to specify the Python environment from the global settings via the UI;
- The three parameters (example below) take effect when PySpark runs initGateway;
spark.yarn.dist.archives=hdfs://acluster/user/spark/python-env/py.tar.gz#environment
spark.pyspark.python=./environment/bin/python
spark.pyspark.driver.python=/usr/local/python-env/py/bin/python
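
As a rough illustration of how the three example values relate to each other, the following Python sketch checks that the executor interpreter path points inside the directory that YARN creates for the shipped archive. It relies on standard Spark-on-YARN behavior: the `#environment` suffix names the directory the archive is unpacked into in each container, and `spark.pyspark.driver.python` defaults to `spark.pyspark.python` when unset. The helper names (`archive_aliases`, `executor_python_is_shipped`, `driver_python`) are hypothetical and are not part of this PR or of Linkis.

```python
import posixpath


def archive_aliases(dist_archives):
    """Return the directory names the archives are unpacked under in a YARN container."""
    aliases = []
    for entry in dist_archives.split(","):
        entry = entry.strip()
        if not entry:
            continue
        uri, _, alias = entry.partition("#")
        # Without a '#alias' suffix, the archive's own file name is used.
        aliases.append(alias or posixpath.basename(uri))
    return aliases


def executor_python_is_shipped(conf):
    """True if spark.pyspark.python is a relative path inside a shipped archive."""
    python = conf.get("spark.pyspark.python", "")
    if not python or posixpath.isabs(python):
        # An absolute path must already exist on every node; nothing is shipped.
        return False
    top_level = posixpath.normpath(python).split("/")[0]
    return top_level in archive_aliases(conf.get("spark.yarn.dist.archives", ""))


def driver_python(conf):
    """Driver interpreter: the explicit setting, else the executor setting."""
    return conf.get("spark.pyspark.driver.python") or conf.get("spark.pyspark.python")


# Values copied from the example in this PR description.
example_conf = {
    "spark.yarn.dist.archives": "hdfs://acluster/user/spark/python-env/py.tar.gz#environment",
    "spark.pyspark.python": "./environment/bin/python",
    "spark.pyspark.driver.python": "/usr/local/python-env/py/bin/python",
}

if __name__ == "__main__":
    print(archive_aliases(example_conf["spark.yarn.dist.archives"]))
    print(executor_python_is_shipped(example_conf))
    print(driver_python(example_conf))
```

In the example values, the executors use the interpreter unpacked from the archive, while the driver uses an absolute path that is expected to exist already on the host where the driver runs.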
Checklist
- [x] I have read the Contributing Guidelines on pull requests.
- [x] I have explained the need for this PR and the problem it solves
- [x] I have explained the changes or the new features added to this PR
- [ ] I have added tests corresponding to this change
- [ ] I have updated the documentation to reflect this change
- [ ] I have verified that this change is backward compatible (If not, please discuss on the Linkis mailing list first)
- [ ] If this is a code change: I have written unit tests to fully verify the new behavior.
Codecov Report
Merging #3525 (94851e5) into dev-1.3.1 (4874820) will increase coverage by 0.08%. The diff coverage is 0.00%.
:exclamation: Current head 94851e5 differs from pull request most recent head d2df0e5. Consider uploading reports for the commit d2df0e5 to get more accurate results
Coverage Diff
| | dev-1.3.1 | #3525 | +/- |
|---|---|---|---|
| Coverage | 14.14% | 14.22% | +0.08% |
| Complexity | 1512 | 1523 | +11 |
| Files | 1046 | 1036 | -10 |
| Lines | 38647 | 38369 | -278 |
| Branches | 5426 | 5427 | +1 |
| Hits | 5465 | 5459 | -6 |
| Misses | 32422 | 32149 | -273 |
| Partials | 760 | 761 | +1 |

| Impacted Files | Coverage Δ | |
|---|---|---|
| ...engineplugin/spark/config/SparkConfiguration.scala | 93.54% <ø> (-0.21%) | :arrow_down: |
| ...ineplugin/spark/executor/SparkPythonExecutor.scala | 0.00% <0.00%> (ø) | |
| ...a/org/apache/linkis/scheduler/queue/Consumer.scala | 85.71% <0.00%> (-14.29%) | :arrow_down: |
| ...orcode/LinkisGwAuthenticationErrorCodeSummary.java | 63.63% <0.00%> (-8.59%) | :arrow_down: |
| ...data/errorcode/LinkisMetadataErrorCodeSummary.java | 59.09% <0.00%> (-7.58%) | :arrow_down: |
| ...s/cs/errorcode/LinkisCsServerErrorCodeSummary.java | 58.82% <0.00%> (-2.72%) | :arrow_down: |
| ...s/scheduler/queue/fifoqueue/FIFOUserConsumer.scala | 36.79% <0.00%> (-1.89%) | :arrow_down: |
| ...ugin/jdbc/executor/JDBCMultiDatasourceParser.scala | 45.86% <0.00%> (-0.71%) | :arrow_down: |
| ...main/scala/org/apache/linkis/rpc/RPCMapCache.scala | 0.00% <0.00%> (ø) | |
| .../org/apache/linkis/rpc/transform/RPCConsumer.scala | 0.00% <0.00%> (ø) | |
| ... and 91 more | | |
Run 'mvn spotless:apply' to fix these violations.