[SPARK-38425][K8S] Avoid possible errors due to incorrect file size or type supplied in hadoop conf

Open lyssg opened this issue 2 years ago • 11 comments

What changes were proposed in this pull request?

Skip mounting files from the hadoop conf if they are binary or too large to fit within the configMap's max size.

Why are the changes needed?

A configMap cannot hold binary files, and there is also a limit on how much data a configMap can hold. The equivalent limit for the spark conf was already implemented in [SPARK-32221].
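To illustrate the two checks described above, here is a minimal sketch in Python, not the code from this PR (which targets Spark's Scala K8S module). The names `select_conf_files`, `is_utf8_text`, and `MAX_SIZE_BYTES` are hypothetical, the size cap is an illustrative value rather than Spark's configured limit, and "binary" is approximated as "does not decode as UTF-8".

```python
import os

# Illustrative cap only (assumption); the real limit depends on the cluster
# and on Spark's configuration.
MAX_SIZE_BYTES = 1024 * 1024


def is_utf8_text(path):
    """Return True if the whole file decodes as UTF-8.

    This is a simple stand-in for "not binary"; a real check may differ.
    """
    try:
        with open(path, "rb") as f:
            f.read().decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


def select_conf_files(conf_dir, max_size_bytes=MAX_SIZE_BYTES):
    """Split the regular files in conf_dir into (kept, skipped) name lists.

    A file is skipped if adding it would push the running total over
    max_size_bytes, or if it is not valid UTF-8 text.
    """
    kept, skipped = [], []
    total = 0
    for name in sorted(os.listdir(conf_dir)):
        path = os.path.join(conf_dir, name)
        if not os.path.isfile(path):
            continue
        size = os.path.getsize(path)
        # Check the size first so oversized files are never read into memory.
        if total + size > max_size_bytes or not is_utf8_text(path):
            skipped.append(name)
            continue
        kept.append(name)
        total += size
    return kept, skipped
```

Skipping a file instead of failing means the remaining configuration can still be mounted; logging the skipped names would tell the user which files were left out.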

Does this PR introduce any user-facing change?

Yes. In simple terms, it avoids possible errors caused by negligence (for example, placing a large file or a binary file in SPARK_CONF_DIR) and thus improves the user experience.

How was this patch tested?

Tested manually in a k8s cluster, and with the existing tests.

lyssg avatar Feb 27 '22 09:02 lyssg

Can one of the admins verify this patch?

AmplabJenkins avatar Feb 27 '22 11:02 AmplabJenkins

yes

------------------ Original message ------------------ From: "UCB @.>; Sent: Sunday, February 27, 2022, 7:09 PM To: @.>; Cc: @.>; @.>; Subject: Re: [apache/spark] [K8S] Avoid possible errors due to incorrect file size or type supplied in hadoop conf (PR #35667)

Can one of the admins verify this patch?


lyssg avatar Feb 27 '22 11:02 lyssg

@lyssg mind linking the JIRA into the PR title please? See also https://spark.apache.org/contributing.html

HyukjinKwon avatar Feb 28 '22 00:02 HyukjinKwon

@lyssg mind linking the JIRA into the PR title please? See also https://spark.apache.org/contributing.html

Thanks, I will complete it.

lyssg avatar Feb 28 '22 01:02 lyssg

New test(s) could be added to org.apache.spark.deploy.k8s.features.HadoopConfDriverFeatureStepSuite

ok

lyssg avatar Mar 06 '22 13:03 lyssg

Could you address the previous comments, @lyssg ? In addition, please enable GitHub Action in your Apache Spark fork. Apache Spark community is using your GitHub resource quota and it's free.

OK, I am adding the test, and I will enable GitHub Actions.

lyssg avatar Apr 05 '22 08:04 lyssg

@HyukjinKwon Hi, I have enabled GitHub Actions and rebased my branch onto the latest master branch. Why does the workflow run detection still fail?

lyssg avatar Apr 11 '22 16:04 lyssg

@lyssg https://lists.apache.org/thread/7627q61x7tooob7q5sn96xbvbcqkf2ms

martin-g avatar Apr 12 '22 06:04 martin-g

Could you rebase this PR to the master branch for Apache Spark 3.4, @lyssg ?

Also, cc @ScrapCodes

Hi, I have rebased this PR onto the master branch.

lyssg avatar Aug 15 '22 15:08 lyssg

Thank you for the updates, @lyssg.

dongjoon-hyun avatar Aug 15 '22 22:08 dongjoon-hyun

@dongjoon-hyun, @martin-g, @ScrapCodes, I have rebased my PR again. Could you take another look?

lyssg avatar Sep 02 '22 01:09 lyssg

We're closing this PR because it hasn't been updated in a while. This isn't a judgement on the merit of the PR in any way. It's just a way of keeping the PR queue manageable. If you'd like to revive this PR, please reopen it and ask a committer to remove the Stale tag!

github-actions[bot] avatar Dec 12 '22 00:12 github-actions[bot]