HADOOP-18400. Fix file split duplicating records from a succeeding split when reading BZip2 text files

Open hotcodemacha opened this issue 3 years ago • 1 comments

Description of PR

Fix file split duplicating records from a succeeding split when reading BZip2 text files.

JIRA - HADOOP-18400

How was this patch tested?

Added Unit tests.

For code changes:

  • [X] Does the title of this PR start with the corresponding JIRA issue id (e.g. 'HADOOP-17799. Your PR title ...')?
  • [ ] Object storage: have the integration tests been executed and the endpoint declared according to the connector-specific documentation?
  • [ ] If adding new dependencies to the code, are these dependencies licensed in a way that is compatible for inclusion under ASF 2.0?
  • [ ] If applicable, have you updated the LICENSE, LICENSE-binary, NOTICE-binary files?

hotcodemacha • Aug 11 '22 01:08

:confetti_ball: +1 overall

| Vote | Subsystem | Runtime | Logfile | Comment |
|:----:|----------:|--------:|:-------:|:--------|
| +0 :ok: | reexec | 0m 47s | | Docker mode activated. |
| | | | | _ Prechecks _ |
| +1 :green_heart: | dupname | 0m 0s | | No case conflicting files found. |
| +0 :ok: | codespell | 0m 1s | | codespell was not available. |
| +0 :ok: | detsecrets | 0m 1s | | detect-secrets was not available. |
| +1 :green_heart: | @author | 0m 0s | | The patch does not contain any @author tags. |
| +1 :green_heart: | test4tests | 0m 0s | | The patch appears to include 2 new or modified test files. |
| | | | | _ trunk Compile Tests _ |
| +0 :ok: | mvndep | 15m 29s | | Maven dependency ordering for branch |
| +1 :green_heart: | mvninstall | 28m 32s | | trunk passed |
| +1 :green_heart: | compile | 25m 15s | | trunk passed with JDK Private Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1 |
| +1 :green_heart: | compile | 22m 0s | | trunk passed with JDK Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07 |
| +1 :green_heart: | checkstyle | 4m 30s | | trunk passed |
| +1 :green_heart: | mvnsite | 3m 15s | | trunk passed |
| +1 :green_heart: | javadoc | 2m 29s | | trunk passed with JDK Private Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1 |
| +1 :green_heart: | javadoc | 1m 59s | | trunk passed with JDK Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07 |
| +1 :green_heart: | spotbugs | 4m 54s | | trunk passed |
| +1 :green_heart: | shadedclient | 24m 45s | | branch has no errors when building and testing our client artifacts. |
| | | | | _ Patch Compile Tests _ |
| +0 :ok: | mvndep | 0m 27s | | Maven dependency ordering for patch |
| +1 :green_heart: | mvninstall | 1m 43s | | the patch passed |
| +1 :green_heart: | compile | 24m 24s | | the patch passed with JDK Private Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1 |
| +1 :green_heart: | javac | 24m 24s | | the patch passed |
| +1 :green_heart: | compile | 21m 56s | | the patch passed with JDK Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07 |
| +1 :green_heart: | javac | 21m 56s | | the patch passed |
| +1 :green_heart: | blanks | 0m 0s | | The patch has no blanks issues. |
| +1 :green_heart: | checkstyle | 4m 20s | | the patch passed |
| +1 :green_heart: | mvnsite | 3m 12s | | the patch passed |
| +1 :green_heart: | javadoc | 2m 19s | | the patch passed with JDK Private Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1 |
| +1 :green_heart: | javadoc | 2m 0s | | the patch passed with JDK Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07 |
| +1 :green_heart: | spotbugs | 5m 8s | | the patch passed |
| +1 :green_heart: | shadedclient | 25m 1s | | patch has no errors when building and testing our client artifacts. |
| | | | | _ Other Tests _ |
| +1 :green_heart: | unit | 18m 23s | | hadoop-common in the patch passed. |
| +1 :green_heart: | unit | 7m 28s | | hadoop-mapreduce-client-core in the patch passed. |
| +1 :green_heart: | asflicense | 1m 17s | | The patch does not generate ASF License warnings. |
| | | 255m 10s | | |

| Subsystem | Report/Notes |
|----------:|:-------------|
| Docker | ClientAPI=1.41 ServerAPI=1.41 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4732/1/artifact/out/Dockerfile |
| GITHUB PR | https://github.com/apache/hadoop/pull/4732 |
| Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient spotbugs checkstyle codespell detsecrets |
| uname | Linux 851d47b409b4 4.15.0-175-generic #184-Ubuntu SMP Thu Mar 24 17:48:36 UTC 2022 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | dev-support/bin/hadoop.sh |
| git revision | trunk / 41861391e6ccc3f980d23014b655fd9e22e58ed9 |
| Default Java | Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07 |
| Multi-JDK versions | /usr/lib/jvm/java-11-openjdk-amd64:Private Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1 /usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07 |
| Test Results | https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4732/1/testReport/ |
| Max. process+thread count | 1285 (vs. ulimit of 5500) |
| modules | C: hadoop-common-project/hadoop-common hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core U: . |
| Console output | https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4732/1/console |
| versions | git=2.25.1 maven=3.6.3 spotbugs=4.2.2 |
| Powered by | Apache Yetus 0.14.0 https://yetus.apache.org |

This message was automatically generated.

hadoop-yetus • Aug 11 '22 06:08

Thanks @aajisaka for final review/merge and @saswata-dutta for your additional review.

hotcodemacha • Sep 19 '22 04:09