TEZ-4580: Slow preemption of new containers when re-use is enabled
When container reuse is enabled, preemption of lower priority containers that are not yet assigned to task, takes long time as they are released one at a time, and not the number of containers based when tez.am.preemption.percentage is high added in https://issues.apache.org/jira/browse/TEZ-1742.
Further investigation lead to following conclusion:
-
Warn log / Assertion error thrown because in preemptIfNeeded(), when releasing new containers, the loop counter is being decremented with each
releaseUnassignedContainers, leading to looping only half number of times. By using another counter, assertion passes because of condition method returns with checkif (numPendingRequestsToService < 1) {. -
In releaseContainer(), the container is not getting removed from
delayedContainersqueue and only fromheldContainersmap, hence same container is being picked up for release in every iteration till next cycle ofDelayedContainerManagerfinds out that the container is not inheldContainersand skips it with logSkipping delayed container as container is no longer running, containerId=...
This change adds a method in DelayedContainerManager to allow removal of delayed container and invokes it in releaseContainer method, which so far only removed it from heldContainers map.
:confetti_ball: +1 overall
| Vote | Subsystem | Runtime | Comment |
|---|---|---|---|
| +0 :ok: | reexec | 0m 9s | Docker mode activated. |
| _ Prechecks _ | |||
| +1 :green_heart: | dupname | 0m 0s | No case conflicting files found. |
| +1 :green_heart: | @author | 0m 0s | The patch does not contain any @author tags. |
| +1 :green_heart: | test4tests | 0m 0s | The patch appears to include 1 new or modified test files. |
| _ master Compile Tests _ | |||
| +1 :green_heart: | mvninstall | 15m 50s | master passed |
| +1 :green_heart: | compile | 0m 26s | master passed with JDK Ubuntu-11.0.24+8-post-Ubuntu-1ubuntu322.04 |
| +1 :green_heart: | compile | 0m 22s | master passed with JDK Private Build-1.8.0_422-8u422-b05-1~22.04-b05 |
| +1 :green_heart: | checkstyle | 1m 21s | master passed |
| +1 :green_heart: | javadoc | 0m 31s | master passed with JDK Ubuntu-11.0.24+8-post-Ubuntu-1ubuntu322.04 |
| +1 :green_heart: | javadoc | 0m 16s | master passed with JDK Private Build-1.8.0_422-8u422-b05-1~22.04-b05 |
| +0 :ok: | spotbugs | 1m 20s | Used deprecated FindBugs config; considering switching to SpotBugs. |
| +1 :green_heart: | findbugs | 1m 19s | master passed |
| _ Patch Compile Tests _ | |||
| +1 :green_heart: | mvninstall | 0m 15s | the patch passed |
| +1 :green_heart: | compile | 0m 17s | the patch passed with JDK Ubuntu-11.0.24+8-post-Ubuntu-1ubuntu322.04 |
| +1 :green_heart: | javac | 0m 17s | the patch passed |
| +1 :green_heart: | compile | 0m 14s | the patch passed with JDK Private Build-1.8.0_422-8u422-b05-1~22.04-b05 |
| +1 :green_heart: | javac | 0m 14s | the patch passed |
| -0 :warning: | checkstyle | 0m 13s | tez-dag: The patch generated 1 new + 146 unchanged - 0 fixed = 147 total (was 146) |
| +1 :green_heart: | whitespace | 0m 0s | The patch has no whitespace issues. |
| +1 :green_heart: | javadoc | 0m 7s | the patch passed with JDK Ubuntu-11.0.24+8-post-Ubuntu-1ubuntu322.04 |
| +1 :green_heart: | javadoc | 0m 7s | the patch passed with JDK Private Build-1.8.0_422-8u422-b05-1~22.04-b05 |
| +1 :green_heart: | findbugs | 0m 46s | the patch passed |
| _ Other Tests _ | |||
| +1 :green_heart: | unit | 4m 25s | tez-dag in the patch passed. |
| +1 :green_heart: | asflicense | 0m 15s | The patch does not generate ASF License warnings. |
| 27m 34s |
| Subsystem | Report/Notes |
|---|---|
| Docker | ClientAPI=1.47 ServerAPI=1.47 base: https://ci-hadoop.apache.org/job/tez-multibranch/job/PR-374/1/artifact/out/Dockerfile |
| GITHUB PR | https://github.com/apache/tez/pull/374 |
| JIRA Issue | TEZ-4580 |
| Optional Tests | dupname asflicense javac javadoc unit spotbugs findbugs checkstyle compile |
| uname | Linux 7ae9360fbab1 5.15.0-117-generic #127-Ubuntu SMP Fri Jul 5 20:13:28 UTC 2024 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | personality/tez.sh |
| git revision | master / 607b2bc22 |
| Default Java | Private Build-1.8.0_422-8u422-b05-1~22.04-b05 |
| Multi-JDK versions | /usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.24+8-post-Ubuntu-1ubuntu322.04 /usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_422-8u422-b05-1~22.04-b05 |
| checkstyle | https://ci-hadoop.apache.org/job/tez-multibranch/job/PR-374/1/artifact/out/diff-checkstyle-tez-dag.txt |
| Test Results | https://ci-hadoop.apache.org/job/tez-multibranch/job/PR-374/1/testReport/ |
| Max. process+thread count | 201 (vs. ulimit of 5500) |
| modules | C: tez-dag U: tez-dag |
| Console output | https://ci-hadoop.apache.org/job/tez-multibranch/job/PR-374/1/console |
| versions | git=2.34.1 maven=3.6.3 findbugs=3.0.1 |
| Powered by | Apache Yetus 0.12.0 https://yetus.apache.org |
This message was automatically generated.
@abstractdog can you please take a look when you are free. Thanks in advance! :)
thanks @himanshu-mishra for taking care of this
according to the description of TEZ-1742:
Tez YARN Task Scheduler currently preempts 1 running task at a time when a higher priority task is waiting and there are no available resources. When a large number of higher priority tasks are pending then it can take a long time to preempt the required number of lower priority tasks.
after checking and trying out the unit test and the fix here, it seems like this patch what's exactly fixes the abovementioned problem
apart from the assertion error fix, the unit test indeed proves that releaseAssignedContainer is called as many times as needed according to the percentage config in a single round, so this looks good to me
only left a minor comment regarding a null check
Thanks @simhadri-g , @abstractdog for review. I have incorporated feedback as following:
- Keep only happy path log
- Remove null check
- Change log from info to debug
Also adding a note that as the the changes fixes slow preemption, it leads to performance gain mentioned in TEZ-1742 when container re-use is enabled, which is enabled by default in Tez.
:confetti_ball: +1 overall
| Vote | Subsystem | Runtime | Comment |
|---|---|---|---|
| +0 :ok: | reexec | 0m 9s | Docker mode activated. |
| _ Prechecks _ | |||
| +1 :green_heart: | dupname | 0m 0s | No case conflicting files found. |
| +1 :green_heart: | @author | 0m 0s | The patch does not contain any @author tags. |
| +1 :green_heart: | test4tests | 0m 0s | The patch appears to include 1 new or modified test files. |
| _ master Compile Tests _ | |||
| +1 :green_heart: | mvninstall | 5m 34s | master passed |
| +1 :green_heart: | compile | 0m 22s | master passed with JDK Ubuntu-11.0.25+9-post-Ubuntu-1ubuntu122.04 |
| +1 :green_heart: | compile | 0m 21s | master passed with JDK Private Build-1.8.0_432-8u432-ga~us1-0ubuntu2~22.04-ga |
| +1 :green_heart: | checkstyle | 0m 53s | master passed |
| +1 :green_heart: | javadoc | 0m 25s | master passed with JDK Ubuntu-11.0.25+9-post-Ubuntu-1ubuntu122.04 |
| +1 :green_heart: | javadoc | 0m 12s | master passed with JDK Private Build-1.8.0_432-8u432-ga~us1-0ubuntu2~22.04-ga |
| +0 :ok: | spotbugs | 1m 13s | Used deprecated FindBugs config; considering switching to SpotBugs. |
| +1 :green_heart: | findbugs | 1m 12s | master passed |
| _ Patch Compile Tests _ | |||
| +1 :green_heart: | mvninstall | 0m 14s | the patch passed |
| +1 :green_heart: | compile | 0m 16s | the patch passed with JDK Ubuntu-11.0.25+9-post-Ubuntu-1ubuntu122.04 |
| +1 :green_heart: | javac | 0m 16s | the patch passed |
| +1 :green_heart: | compile | 0m 15s | the patch passed with JDK Private Build-1.8.0_432-8u432-ga~us1-0ubuntu2~22.04-ga |
| +1 :green_heart: | javac | 0m 15s | the patch passed |
| -0 :warning: | checkstyle | 0m 10s | tez-dag: The patch generated 1 new + 146 unchanged - 0 fixed = 147 total (was 146) |
| +1 :green_heart: | whitespace | 0m 0s | The patch has no whitespace issues. |
| +1 :green_heart: | javadoc | 0m 6s | the patch passed with JDK Ubuntu-11.0.25+9-post-Ubuntu-1ubuntu122.04 |
| +1 :green_heart: | javadoc | 0m 6s | the patch passed with JDK Private Build-1.8.0_432-8u432-ga~us1-0ubuntu2~22.04-ga |
| +1 :green_heart: | findbugs | 0m 41s | the patch passed |
| _ Other Tests _ | |||
| +1 :green_heart: | unit | 4m 14s | tez-dag in the patch passed. |
| +1 :green_heart: | asflicense | 0m 14s | The patch does not generate ASF License warnings. |
| 16m 3s |
| Subsystem | Report/Notes |
|---|---|
| Docker | ClientAPI=1.47 ServerAPI=1.47 base: https://ci-hadoop.apache.org/job/tez-multibranch/job/PR-374/2/artifact/out/Dockerfile |
| GITHUB PR | https://github.com/apache/tez/pull/374 |
| JIRA Issue | TEZ-4580 |
| Optional Tests | dupname asflicense javac javadoc unit spotbugs findbugs checkstyle compile |
| uname | Linux dd3da0311850 5.15.0-124-generic #134-Ubuntu SMP Fri Sep 27 20:20:17 UTC 2024 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | personality/tez.sh |
| git revision | master / 1084699dd |
| Default Java | Private Build-1.8.0_432-8u432-ga~us1-0ubuntu2~22.04-ga |
| Multi-JDK versions | /usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.25+9-post-Ubuntu-1ubuntu122.04 /usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_432-8u432-ga~us1-0ubuntu2~22.04-ga |
| checkstyle | https://ci-hadoop.apache.org/job/tez-multibranch/job/PR-374/2/artifact/out/diff-checkstyle-tez-dag.txt |
| Test Results | https://ci-hadoop.apache.org/job/tez-multibranch/job/PR-374/2/testReport/ |
| Max. process+thread count | 227 (vs. ulimit of 5500) |
| modules | C: tez-dag U: tez-dag |
| Console output | https://ci-hadoop.apache.org/job/tez-multibranch/job/PR-374/2/console |
| versions | git=2.34.1 maven=3.6.3 findbugs=3.0.1 |
| Powered by | Apache Yetus 0.12.0 https://yetus.apache.org |
This message was automatically generated.
:confetti_ball: +1 overall
| Vote | Subsystem | Runtime | Comment |
|---|---|---|---|
| +0 :ok: | reexec | 0m 14s | Docker mode activated. |
| _ Prechecks _ | |||
| +1 :green_heart: | dupname | 0m 0s | No case conflicting files found. |
| +1 :green_heart: | @author | 0m 0s | The patch does not contain any @author tags. |
| +1 :green_heart: | test4tests | 0m 0s | The patch appears to include 1 new or modified test files. |
| _ master Compile Tests _ | |||
| +1 :green_heart: | mvninstall | 7m 29s | master passed |
| +1 :green_heart: | compile | 0m 42s | master passed with JDK Ubuntu-11.0.25+9-post-Ubuntu-1ubuntu122.04 |
| +1 :green_heart: | compile | 0m 40s | master passed with JDK Private Build-1.8.0_432-8u432-ga~us1-0ubuntu2~22.04-ga |
| +1 :green_heart: | checkstyle | 0m 52s | master passed |
| +1 :green_heart: | javadoc | 0m 29s | master passed with JDK Ubuntu-11.0.25+9-post-Ubuntu-1ubuntu122.04 |
| +1 :green_heart: | javadoc | 0m 24s | master passed with JDK Private Build-1.8.0_432-8u432-ga~us1-0ubuntu2~22.04-ga |
| +0 :ok: | spotbugs | 1m 27s | Used deprecated FindBugs config; considering switching to SpotBugs. |
| +1 :green_heart: | findbugs | 1m 25s | master passed |
| _ Patch Compile Tests _ | |||
| +1 :green_heart: | mvninstall | 0m 26s | the patch passed |
| +1 :green_heart: | compile | 0m 28s | the patch passed with JDK Ubuntu-11.0.25+9-post-Ubuntu-1ubuntu122.04 |
| +1 :green_heart: | javac | 0m 28s | the patch passed |
| +1 :green_heart: | compile | 0m 25s | the patch passed with JDK Private Build-1.8.0_432-8u432-ga~us1-0ubuntu2~22.04-ga |
| +1 :green_heart: | javac | 0m 25s | the patch passed |
| -0 :warning: | checkstyle | 0m 21s | tez-dag: The patch generated 1 new + 146 unchanged - 0 fixed = 147 total (was 146) |
| +1 :green_heart: | whitespace | 0m 0s | The patch has no whitespace issues. |
| +1 :green_heart: | javadoc | 0m 9s | the patch passed with JDK Ubuntu-11.0.25+9-post-Ubuntu-1ubuntu122.04 |
| +1 :green_heart: | javadoc | 0m 9s | the patch passed with JDK Private Build-1.8.0_432-8u432-ga~us1-0ubuntu2~22.04-ga |
| +1 :green_heart: | findbugs | 1m 10s | the patch passed |
| _ Other Tests _ | |||
| +1 :green_heart: | unit | 5m 6s | tez-dag in the patch passed. |
| +1 :green_heart: | asflicense | 0m 11s | The patch does not generate ASF License warnings. |
| 21m 40s |
| Subsystem | Report/Notes |
|---|---|
| Docker | ClientAPI=1.47 ServerAPI=1.47 base: https://ci-hadoop.apache.org/job/tez-multibranch/job/PR-374/3/artifact/out/Dockerfile |
| GITHUB PR | https://github.com/apache/tez/pull/374 |
| JIRA Issue | TEZ-4580 |
| Optional Tests | dupname asflicense javac javadoc unit spotbugs findbugs checkstyle compile |
| uname | Linux ea7ca84329e4 5.15.0-126-generic #136-Ubuntu SMP Wed Nov 6 10:38:22 UTC 2024 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | personality/tez.sh |
| git revision | master / 1084699dd |
| Default Java | Private Build-1.8.0_432-8u432-ga~us1-0ubuntu2~22.04-ga |
| Multi-JDK versions | /usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.25+9-post-Ubuntu-1ubuntu122.04 /usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_432-8u432-ga~us1-0ubuntu2~22.04-ga |
| checkstyle | https://ci-hadoop.apache.org/job/tez-multibranch/job/PR-374/3/artifact/out/diff-checkstyle-tez-dag.txt |
| Test Results | https://ci-hadoop.apache.org/job/tez-multibranch/job/PR-374/3/testReport/ |
| Max. process+thread count | 217 (vs. ulimit of 5500) |
| modules | C: tez-dag U: tez-dag |
| Console output | https://ci-hadoop.apache.org/job/tez-multibranch/job/PR-374/3/console |
| versions | git=2.34.1 maven=3.6.3 findbugs=3.0.1 |
| Powered by | Apache Yetus 0.12.0 https://yetus.apache.org |
This message was automatically generated.
@himanshu-mishra : minor checkstyle left, can you please address: ./tez-dag/src/test/java/org/apache/tez/dag/app/rm/TestTaskScheduler.java:1042: // We don't want containers to be assigned to a task by delayedContainerManager as it invokes another preemption flow: Line is longer than 120 characters (found 121). [LineLength]
:confetti_ball: +1 overall
| Vote | Subsystem | Runtime | Comment |
|---|---|---|---|
| +0 :ok: | reexec | 0m 9s | Docker mode activated. |
| _ Prechecks _ | |||
| +1 :green_heart: | dupname | 0m 0s | No case conflicting files found. |
| +1 :green_heart: | @author | 0m 0s | The patch does not contain any @author tags. |
| +1 :green_heart: | test4tests | 0m 0s | The patch appears to include 1 new or modified test files. |
| _ master Compile Tests _ | |||
| +1 :green_heart: | mvninstall | 6m 30s | master passed |
| +1 :green_heart: | compile | 0m 25s | master passed with JDK Ubuntu-11.0.25+9-post-Ubuntu-1ubuntu122.04 |
| +1 :green_heart: | compile | 0m 21s | master passed with JDK Private Build-1.8.0_432-8u432-ga~us1-0ubuntu2~22.04-ga |
| +1 :green_heart: | checkstyle | 0m 33s | master passed |
| +1 :green_heart: | javadoc | 0m 21s | master passed with JDK Ubuntu-11.0.25+9-post-Ubuntu-1ubuntu122.04 |
| +1 :green_heart: | javadoc | 0m 16s | master passed with JDK Private Build-1.8.0_432-8u432-ga~us1-0ubuntu2~22.04-ga |
| +0 :ok: | spotbugs | 1m 1s | Used deprecated FindBugs config; considering switching to SpotBugs. |
| +1 :green_heart: | findbugs | 0m 59s | master passed |
| _ Patch Compile Tests _ | |||
| +1 :green_heart: | mvninstall | 0m 19s | the patch passed |
| +1 :green_heart: | compile | 0m 17s | the patch passed with JDK Ubuntu-11.0.25+9-post-Ubuntu-1ubuntu122.04 |
| +1 :green_heart: | javac | 0m 17s | the patch passed |
| +1 :green_heart: | compile | 0m 13s | the patch passed with JDK Private Build-1.8.0_432-8u432-ga~us1-0ubuntu2~22.04-ga |
| +1 :green_heart: | javac | 0m 13s | the patch passed |
| +1 :green_heart: | checkstyle | 0m 13s | the patch passed |
| +1 :green_heart: | whitespace | 0m 0s | The patch has no whitespace issues. |
| +1 :green_heart: | javadoc | 0m 7s | the patch passed with JDK Ubuntu-11.0.25+9-post-Ubuntu-1ubuntu122.04 |
| +1 :green_heart: | javadoc | 0m 8s | the patch passed with JDK Private Build-1.8.0_432-8u432-ga~us1-0ubuntu2~22.04-ga |
| +1 :green_heart: | findbugs | 0m 47s | the patch passed |
| _ Other Tests _ | |||
| +1 :green_heart: | unit | 4m 15s | tez-dag in the patch passed. |
| +1 :green_heart: | asflicense | 0m 14s | The patch does not generate ASF License warnings. |
| 16m 50s |
| Subsystem | Report/Notes |
|---|---|
| Docker | ClientAPI=1.47 ServerAPI=1.47 base: https://ci-hadoop.apache.org/job/tez-multibranch/job/PR-374/4/artifact/out/Dockerfile |
| GITHUB PR | https://github.com/apache/tez/pull/374 |
| JIRA Issue | TEZ-4580 |
| Optional Tests | dupname asflicense javac javadoc unit spotbugs findbugs checkstyle compile |
| uname | Linux 0ae4067883a1 5.15.0-124-generic #134-Ubuntu SMP Fri Sep 27 20:20:17 UTC 2024 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | personality/tez.sh |
| git revision | master / b95defcd7 |
| Default Java | Private Build-1.8.0_432-8u432-ga~us1-0ubuntu2~22.04-ga |
| Multi-JDK versions | /usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.25+9-post-Ubuntu-1ubuntu122.04 /usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_432-8u432-ga~us1-0ubuntu2~22.04-ga |
| Test Results | https://ci-hadoop.apache.org/job/tez-multibranch/job/PR-374/4/testReport/ |
| Max. process+thread count | 202 (vs. ulimit of 5500) |
| modules | C: tez-dag U: tez-dag |
| Console output | https://ci-hadoop.apache.org/job/tez-multibranch/job/PR-374/4/console |
| versions | git=2.34.1 maven=3.6.3 findbugs=3.0.1 |
| Powered by | Apache Yetus 0.12.0 https://yetus.apache.org |
This message was automatically generated.