[SPARK-40505][K8S] Remove min heap setting for executor in entrypoint.sh

Open bryanck opened this issue 3 years ago • 1 comments

What changes were proposed in this pull request?

This PR removes the min heap setting (-Xms) from the JVM args when starting the executor process.

Why are the changes needed?

Removing the min heap setting is consistent with the settings for the YARN executor process and allows the JVM to reduce the heap when needed to reduce memory consumption.
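To illustrate the difference, here is a minimal sketch of the two heap configurations. It is not the actual `entrypoint.sh` code: `executor_heap_flags` is a hypothetical helper, and its first argument is assumed to be a JVM size string such as `4g` (the kind of value the executor memory setting carries).

```shell
#!/usr/bin/env bash
# Hypothetical helper for illustration only; not part of entrypoint.sh.
# $1: heap size as a JVM size string, e.g. "4g" (assumed format)
# $2: "true" to pin the initial heap to the maximum (fixed heap),
#     anything else to set only the maximum
executor_heap_flags() {
  local memory="$1"
  local fixed_heap="$2"
  if [ "$fixed_heap" = "true" ]; then
    # Fixed heap: -Xms equals -Xmx, so the JVM starts at the maximum
    # heap size and will not shrink the heap below it.
    printf '%s\n' "-Xms${memory} -Xmx${memory}"
  else
    # Only the upper bound is set: the JVM chooses its initial heap
    # size and may shrink the heap, depending on the collector.
    printf '%s\n' "-Xmx${memory}"
  fi
}

executor_heap_flags "4g" true    # prints "-Xms4g -Xmx4g"
executor_heap_flags "4g" false   # prints "-Xmx4g"
```

With only `-Xmx` set, whether and when heap memory is actually returned to the OS depends on the garbage collector and JVM version in use.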

Does this PR introduce any user-facing change?

No

How was this patch tested?

We have this change deployed in our production environment to solve for OOM kills we were experiencing when running in K8S but were not seeing when running in YARN.

bryanck avatar Sep 20 '22 12:09 bryanck

Can one of the admins verify this patch?

AmplabJenkins avatar Sep 20 '22 20:09 AmplabJenkins

Thanks for pointing out the memory settings, @dongjoon-hyun. The goal of this PR is to bring more consistent memory usage behavior between YARN and k8s rather than to solve for a specific use case. If the community has determined a fixed heap size is optimal, then that should be set for YARN as well, unless there is some reasoning for the difference.

bryanck avatar Sep 23 '22 12:09 bryanck

You can propose this to the YARN committers as a YARN PR, but I believe we don't want to surprise existing YARN users when, as you mentioned, there is no issue there. Although consistency is good in general, there is a reason why we have separate resource manager modules like K8s, YARN, and Mesos.

dongjoon-hyun avatar Sep 23 '22 14:09 dongjoon-hyun

Let me close this PR for now. We can continue our discussion on this PR.

dongjoon-hyun avatar Sep 23 '22 14:09 dongjoon-hyun

You mentioned there is a reason for the different modules, can you explain the reason k8s is configured to use a fixed heap size and YARN does not?

bryanck avatar Sep 23 '22 14:09 bryanck

I don't follow how a YARN PR will help? AFAIK, the executor heap is set by Spark

bryanck avatar Sep 23 '22 14:09 bryanck

I mean that the K8s module has been following best practice for the K8s control plane, and I believe the YARN module is in the same situation.

> You mentioned there is a reason for the different modules, can you explain the reason k8s is configured to use a fixed heap size and YARN does not?

It seems that you misunderstand. I didn't claim it's good in YARN. Why would I make a claim about an environment that I don't use?

> I don't follow how a YARN PR will help?

The PR is responsible for providing a way to prove it.

> If the community has determined a fixed heap size is optimal, then that should be set for YARN as well

dongjoon-hyun avatar Sep 23 '22 16:09 dongjoon-hyun