TEZ-4365: Use Regex Pattern to Parse DAG ID String

Open belugabehr opened this issue 3 years ago • 5 comments

belugabehr avatar Dec 28 '21 16:12 belugabehr

:broken_heart: -1 overall

| Vote | Subsystem | Runtime | Comment |
|:----:|----------:|--------:|:--------|
| +0 :ok: | reexec | 16m 27s | Docker mode activated. |
| | | | _ Prechecks _ |
| +1 :green_heart: | dupname | 0m 0s | No case conflicting files found. |
| +1 :green_heart: | @author | 0m 0s | The patch does not contain any @author tags. |
| +1 :green_heart: | test4tests | 0m 0s | The patch appears to include 1 new or modified test files. |
| | | | _ master Compile Tests _ |
| +1 :green_heart: | mvninstall | 12m 59s | master passed |
| +1 :green_heart: | compile | 0m 23s | master passed with JDK Ubuntu-11.0.13+8-Ubuntu-0ubuntu1.20.04 |
| +1 :green_heart: | compile | 0m 22s | master passed with JDK Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07 |
| +1 :green_heart: | checkstyle | 0m 53s | master passed |
| +1 :green_heart: | javadoc | 0m 33s | master passed with JDK Ubuntu-11.0.13+8-Ubuntu-0ubuntu1.20.04 |
| +1 :green_heart: | javadoc | 0m 19s | master passed with JDK Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07 |
| +0 :ok: | spotbugs | 1m 0s | Used deprecated FindBugs config; considering switching to SpotBugs. |
| +1 :green_heart: | findbugs | 0m 57s | master passed |
| | | | _ Patch Compile Tests _ |
| +1 :green_heart: | mvninstall | 0m 13s | the patch passed |
| +1 :green_heart: | compile | 0m 13s | the patch passed with JDK Ubuntu-11.0.13+8-Ubuntu-0ubuntu1.20.04 |
| +1 :green_heart: | javac | 0m 13s | the patch passed |
| +1 :green_heart: | compile | 0m 11s | the patch passed with JDK Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07 |
| +1 :green_heart: | javac | 0m 11s | the patch passed |
| +1 :green_heart: | checkstyle | 0m 8s | the patch passed |
| -1 :x: | whitespace | 0m 0s | The patch has 1 line(s) that end in whitespace. Use git apply --whitespace=fix <<patch_file>>. Refer https://git-scm.com/docs/git-apply |
| +1 :green_heart: | javadoc | 0m 12s | the patch passed with JDK Ubuntu-11.0.13+8-Ubuntu-0ubuntu1.20.04 |
| +1 :green_heart: | javadoc | 0m 11s | the patch passed with JDK Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07 |
| +1 :green_heart: | findbugs | 0m 32s | the patch passed |
| | | | _ Other Tests _ |
| +1 :green_heart: | unit | 0m 30s | tez-common in the patch passed. |
| +1 :green_heart: | asflicense | 0m 13s | The patch does not generate ASF License warnings. |
| | | 36m 4s | |

| Subsystem | Report/Notes |
|----------:|:-------------|
| Docker | ClientAPI=1.41 ServerAPI=1.41 base: https://ci-hadoop.apache.org/job/tez-multibranch/job/PR-172/1/artifact/out/Dockerfile |
| GITHUB PR | https://github.com/apache/tez/pull/172 |
| JIRA Issue | TEZ-4365 |
| Optional Tests | dupname asflicense javac javadoc unit spotbugs findbugs checkstyle compile |
| uname | Linux bb1b434b01c7 4.15.0-163-generic #171-Ubuntu SMP Fri Nov 5 11:55:11 UTC 2021 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | personality/tez.sh |
| git revision | master / c9b8e90db |
| Default Java | Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07 |
| Multi-JDK versions | /usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.13+8-Ubuntu-0ubuntu1.20.04 /usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07 |
| whitespace | https://ci-hadoop.apache.org/job/tez-multibranch/job/PR-172/1/artifact/out/whitespace-eol.txt |
| Test Results | https://ci-hadoop.apache.org/job/tez-multibranch/job/PR-172/1/testReport/ |
| Max. process+thread count | 91 (vs. ulimit of 5500) |
| modules | C: tez-common U: tez-common |
| Console output | https://ci-hadoop.apache.org/job/tez-multibranch/job/PR-172/1/console |
| versions | git=2.25.1 maven=3.6.3 findbugs=3.0.1 |
| Powered by | Apache Yetus 0.12.0 https://yetus.apache.org |

This message was automatically generated.

tez-yetus avatar Dec 28 '21 17:12 tez-yetus

:confetti_ball: +1 overall

| Vote | Subsystem | Runtime | Comment |
|:----:|----------:|--------:|:--------|
| +0 :ok: | reexec | 1m 37s | Docker mode activated. |
| | | | _ Prechecks _ |
| +1 :green_heart: | dupname | 0m 0s | No case conflicting files found. |
| +1 :green_heart: | @author | 0m 0s | The patch does not contain any @author tags. |
| +1 :green_heart: | test4tests | 0m 0s | The patch appears to include 1 new or modified test files. |
| | | | _ master Compile Tests _ |
| +1 :green_heart: | mvninstall | 14m 42s | master passed |
| +1 :green_heart: | compile | 0m 27s | master passed with JDK Ubuntu-11.0.13+8-Ubuntu-0ubuntu1.20.04 |
| +1 :green_heart: | compile | 0m 23s | master passed with JDK Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07 |
| +1 :green_heart: | checkstyle | 0m 59s | master passed |
| +1 :green_heart: | javadoc | 0m 35s | master passed with JDK Ubuntu-11.0.13+8-Ubuntu-0ubuntu1.20.04 |
| +1 :green_heart: | javadoc | 0m 24s | master passed with JDK Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07 |
| +0 :ok: | spotbugs | 1m 1s | Used deprecated FindBugs config; considering switching to SpotBugs. |
| +1 :green_heart: | findbugs | 0m 59s | master passed |
| | | | _ Patch Compile Tests _ |
| +1 :green_heart: | mvninstall | 0m 15s | the patch passed |
| +1 :green_heart: | compile | 0m 15s | the patch passed with JDK Ubuntu-11.0.13+8-Ubuntu-0ubuntu1.20.04 |
| +1 :green_heart: | javac | 0m 15s | the patch passed |
| +1 :green_heart: | compile | 0m 14s | the patch passed with JDK Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07 |
| +1 :green_heart: | javac | 0m 14s | the patch passed |
| +1 :green_heart: | checkstyle | 0m 8s | the patch passed |
| +1 :green_heart: | whitespace | 0m 0s | The patch has no whitespace issues. |
| +1 :green_heart: | javadoc | 0m 13s | the patch passed with JDK Ubuntu-11.0.13+8-Ubuntu-0ubuntu1.20.04 |
| +1 :green_heart: | javadoc | 0m 13s | the patch passed with JDK Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07 |
| +1 :green_heart: | findbugs | 0m 38s | the patch passed |
| | | | _ Other Tests _ |
| +1 :green_heart: | unit | 0m 32s | tez-common in the patch passed. |
| +1 :green_heart: | asflicense | 0m 14s | The patch does not generate ASF License warnings. |
| | | 23m 44s | |

| Subsystem | Report/Notes |
|----------:|:-------------|
| Docker | ClientAPI=1.41 ServerAPI=1.41 base: https://ci-hadoop.apache.org/job/tez-multibranch/job/PR-172/2/artifact/out/Dockerfile |
| GITHUB PR | https://github.com/apache/tez/pull/172 |
| JIRA Issue | TEZ-4365 |
| Optional Tests | dupname asflicense javac javadoc unit spotbugs findbugs checkstyle compile |
| uname | Linux 8454a0c9589f 4.15.0-163-generic #171-Ubuntu SMP Fri Nov 5 11:55:11 UTC 2021 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | personality/tez.sh |
| git revision | master / c9b8e90db |
| Default Java | Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07 |
| Multi-JDK versions | /usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.13+8-Ubuntu-0ubuntu1.20.04 /usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07 |
| Test Results | https://ci-hadoop.apache.org/job/tez-multibranch/job/PR-172/2/testReport/ |
| Max. process+thread count | 91 (vs. ulimit of 5500) |
| modules | C: tez-common U: tez-common |
| Console output | https://ci-hadoop.apache.org/job/tez-multibranch/job/PR-172/2/console |
| versions | git=2.25.1 maven=3.6.3 findbugs=3.0.1 |
| Powered by | Apache Yetus 0.12.0 https://yetus.apache.org |

This message was automatically generated.

tez-yetus avatar Dec 29 '21 04:12 tez-yetus

@belugabehr, can you go back and run the performance tests from https://issues.apache.org/jira/browse/TEZ-1526? It would be interesting to see how this performs after removing the performance optimizations.

jteagles avatar Dec 29 '21 04:12 jteagles

@jteagles I created a small driver with JMH:

Benchmark           Mode  Cnt        Score        Error  Units
TestSplit.master   thrpt   30  5324642.492 ± 228078.761  ops/s
TestSplit.tez4365  thrpt   30  1809324.533 ±  37792.272  ops/s

That is quite a bit slower (about a third of master's throughput), but still an impressive 1,809,324 strings per second on my dated hardware. Using a regex takes fewer lines of code and makes the parsing more readable. But it's your call. If you decide not to accept it, please consider taking the unit test updates.
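For readers following along, here is a minimal sketch of what regex-based DAG ID parsing can look like. It is not the code from the patch: the class name `DagIdRegexSketch`, the `parse` helper, and the assumed ID layout `dag_<clusterTimestamp>_<appId>_<dagIndex>` are illustrative assumptions.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical sketch, not the TEZ-4365 patch itself.
public class DagIdRegexSketch {
    // Assumed layout: dag_<clusterTimestamp>_<appId>_<dagIndex>
    private static final Pattern DAG_ID =
        Pattern.compile("dag_(\\d+)_(\\d+)_(\\d+)");

    /** Returns {clusterTimestamp, appId, dagIndex}; throws on malformed input. */
    public static long[] parse(String s) {
        Matcher m = DAG_ID.matcher(s);
        if (!m.matches()) {
            throw new IllegalArgumentException("Invalid DAG ID: " + s);
        }
        // Each capture group is digits only, so parseLong fails only on overflow.
        return new long[] {
            Long.parseLong(m.group(1)),
            Long.parseLong(m.group(2)),
            Long.parseLong(m.group(3))
        };
    }

    public static void main(String[] args) {
        long[] parts = parse("dag_1389742528635_0001_1");
        System.out.println(parts[0] + " " + parts[1] + " " + parts[2]);
    }
}
```

The pattern is compiled once into a static field, so each call pays only for matching and number conversion; validation and splitting happen in a single `matches()` call.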

belugabehr avatar Dec 30 '21 03:12 belugabehr

This code optimization was critically important, as the event thread spends a significant amount of time parsing task/attempt IDs to dispatch messages. I would hate to lose that. I can appreciate the simplicity of the regex, though. Perhaps the regex can be used to validate the manual parsing, as the manual parsing is more error-prone. And improved testing is welcome.

This patch inspired a YARN ID parsing improvement that produced significant gains there as well. I've linked that JIRA in the original issue for reference: https://issues.apache.org/jira/browse/YARN-6768
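One way to read the suggestion of using the regex to validate the manual parsing is to keep the hand-rolled parser on the hot path and use the regex only as a reference in tests. The following is a rough sketch under that assumption. The class `DagIdCrossCheckSketch`, its methods, and the ID layout `dag_<clusterTimestamp>_<appId>_<dagIndex>` are hypothetical and are not taken from the Tez code base.

```java
import java.util.Arrays;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical sketch: manual parser on the hot path, regex as a test oracle.
public class DagIdCrossCheckSketch {
    private static final String PREFIX = "dag_";
    private static final Pattern DAG_ID =
        Pattern.compile("dag_(\\d+)_(\\d+)_(\\d+)");

    /** Hand-rolled parse using indexOf/substring; throws on malformed input. */
    public static long[] parseManual(String s) {
        if (!s.startsWith(PREFIX)) {
            throw new IllegalArgumentException("Missing prefix: " + s);
        }
        int first = s.indexOf('_', PREFIX.length());
        int second = (first < 0) ? -1 : s.indexOf('_', first + 1);
        // Exactly two separators are expected after the prefix.
        if (second < 0 || s.indexOf('_', second + 1) >= 0) {
            throw new IllegalArgumentException("Wrong number of fields: " + s);
        }
        return new long[] {
            Long.parseLong(s.substring(PREFIX.length(), first)),
            Long.parseLong(s.substring(first + 1, second)),
            Long.parseLong(s.substring(second + 1))
        };
    }

    /** Regex parse used only as a reference; returns null when the input is rejected. */
    public static long[] parseRegex(String s) {
        Matcher m = DAG_ID.matcher(s);
        if (!m.matches()) {
            return null;
        }
        try {
            return new long[] {
                Long.parseLong(m.group(1)),
                Long.parseLong(m.group(2)),
                Long.parseLong(m.group(3))
            };
        } catch (NumberFormatException e) {
            return null; // digits matched but the value overflowed a long
        }
    }

    /** True when both parsers accept with equal fields, or both reject. */
    public static boolean agree(String s) {
        long[] manual;
        try {
            manual = parseManual(s);
        } catch (IllegalArgumentException e) { // also covers NumberFormatException
            manual = null;
        }
        return Arrays.equals(manual, parseRegex(s));
    }

    public static void main(String[] args) {
        System.out.println(agree("dag_1389742528635_0001_1")); // true
        System.out.println(agree("not_a_dag"));                // true (both reject)
        // Long.parseLong accepts a leading sign, the regex does not:
        System.out.println(agree("dag_-1_0001_1"));            // false
    }
}
```

The last case shows the kind of discrepancy such a cross-check can surface: the manual parser in this sketch silently accepts a signed field that the stricter pattern rejects.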

jteagles avatar Dec 30 '21 04:12 jteagles