flink
[FLINK-28915] Flink native K8s mode jar location supports S3 scheme.
What is the purpose of the change
Kubernetes native Application Mode and standalone Application Mode support fetching the job jar from DFS schemes (S3, OSS, HDFS, etc.).
Brief change log
- Fetch the jar from a DFS (S3, OSS, HDFS, etc.) before starting the Flink cluster.
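The fetch step described above can be sketched roughly as follows. This is a JDK-only illustration, not the code from this PR: the class and method names (ArtifactFetchSketch, needsFetch, fetch) are hypothetical, and the JDK's built-in URL handlers cover only schemes such as file and http(s). Fetching from s3, oss, or hdfs would need a file system implementation for those schemes, which this sketch does not include.

```java
import java.io.IOException;
import java.io.InputStream;
import java.net.URI;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.StandardCopyOption;

/** Hypothetical helper (not the PR's code): resolves a job jar URI to a local path. */
final class ArtifactFetchSketch {

    /** A "local" scheme (or no scheme) means the jar is already on the node or in the image. */
    static boolean needsFetch(URI jarUri) {
        String scheme = jarUri.getScheme();
        return scheme != null && !"local".equalsIgnoreCase(scheme);
    }

    /** Copies the jar into targetDir (for example an emptyDir mount) and returns its local path. */
    static Path fetch(URI jarUri, Path targetDir) throws IOException {
        if (!needsFetch(jarUri)) {
            // Nothing to download: drop the scheme and use the path as-is.
            return Paths.get(jarUri.getPath());
        }
        Files.createDirectories(targetDir);
        String fileName = Paths.get(jarUri.getPath()).getFileName().toString();
        Path target = targetDir.resolve(fileName);
        // Assumption of this sketch: only JDK-supported schemes (file, http, https) work here.
        // An unsupported scheme such as s3 makes toURL() throw MalformedURLException.
        try (InputStream in = jarUri.toURL().openStream()) {
            Files.copy(in, target, StandardCopyOption.REPLACE_EXISTING);
        }
        return target;
    }
}
```

The returned path is what the cluster entrypoint would then treat as the user jar. Whether the download happens in the JobManager container or in a separate init step is a design choice this sketch leaves open.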
Verifying this change
This change added tests and can be verified as follows:
- Removed the testDeployApplicationClusterWithNonLocalSchema test
- Added tests that fetch the jar from the http scheme
- Added tests that fetch the jar from the file scheme
- Added a test that creates an emptyDir for saving user artifacts
- Manually verified the local, file, OSS, HDFS with Kerberos, and S3 resources.
Does this pull request potentially affect one of the following parts:
- Dependencies (does it add or upgrade a dependency): no
- The public API, i.e., is any changed class annotated with @Public(Evolving): no
- The serializers: no
- The runtime per-record code paths (performance sensitive): no
- Anything that affects deployment or recovery: yes (Native Kubernetes Application Mode)
- The S3 file system connector: no
Documentation
- Does this pull request introduce a new feature? yes
- If yes, how is the feature documented? docs
CI report:
- c0efcfe43c465daf45fff83154d7469fbd1e4f93 Azure: SUCCESS
I created a new PR based on master. Please take a look when you are free, @wangyang0918 @Aitozi. Thanks.
@SwimSweet Sorry for the late response. I believe this PR could work. However, my biggest concern is that it only works for native K8s application mode. AFAIK, the Yarn application mode and standalone mode should also benefit from this.
@wangyang0918 I will work on supporting Yarn application mode and standalone mode. I found that Yarn application mode already has a similar feature, which provides the yarn.provided.lib.dirs and yarn.provided.usrlib.dir parameters. But it seems that this feature only supports Hadoop file systems?
@SwimSweet Yes. The user jar for Yarn application mode can only be an HDFS file. However, I believe that is enough, since using the Yarn distributed cache is more appropriate than downloading via HTTP or the Flink filesystem directly. This also means that we just need to add support for standalone mode in this PR.
@flinkbot run azure
@flinkbot run azure
@wangyang0918 I have finished the work of StandAlone mode. Please take a look again.
@wangyang0918 Please take a look. Thanks.
Hi @SwimSweet, given your lack of response, @ferenc-csaky has offered to build on your work and take it forward based on my comments above. Hope that is OK; he is planning to post an updated PR next week.
Closing this as the relevant work has been merged as part of #24065.