
HADOOP-11867: Add gather API to file system.

Open omalley opened this issue 5 years ago • 10 comments

Add an asynchronous gather API to PositionedReadable.

omalley avatar Feb 03 '20 23:02 omalley

The benchmark numbers are posted on the jira.

You'll need to help with the spec that you've developed in fsdatainputstream.md. Fundamentally, the new call is logically the same as reading the input ranges with pread in an undefined order. When the CompletableFuture<ByteBuffer> returned from range.getData() is done, the data must be in the buffer.

And yes, I believe this structure will work well for ORC (and likely Parquet).
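To make those semantics concrete, here is a minimal sketch using plain java.nio rather than the Hadoop classes from this patch. The `Range` holder and the `pread`, `readRanges`, and `demo` helpers are hypothetical names chosen for illustration; only the contract (each range is read as if by a positioned read, in no particular order, and the buffer is filled once the future completes) comes from the description above.

```java
import java.io.EOFException;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.CompletableFuture;

public class GatherReadSketch {

  /** Hypothetical range holder; not the class from the patch. */
  static final class Range {
    final long offset;
    final int length;
    CompletableFuture<ByteBuffer> data;

    Range(long offset, int length) {
      this.offset = offset;
      this.length = length;
    }

    CompletableFuture<ByteBuffer> getData() {
      return data;
    }
  }

  /** Positioned read of one range; the channel's own position is untouched. */
  static ByteBuffer pread(FileChannel channel, long offset, int length) {
    ByteBuffer buffer = ByteBuffer.allocate(length);
    try {
      long position = offset;
      while (buffer.hasRemaining()) {
        int n = channel.read(buffer, position);
        if (n < 0) {
          throw new EOFException("EOF before the range was filled");
        }
        position += n;
      }
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
    buffer.flip(); // make the bytes readable from position 0
    return buffer;
  }

  /** Starts every range read asynchronously; completion order is undefined. */
  static void readRanges(FileChannel channel, List<Range> ranges) {
    for (Range r : ranges) {
      r.data = CompletableFuture.supplyAsync(() -> pread(channel, r.offset, r.length));
    }
  }

  /** Writes a small temp file, gathers two ranges, returns them comma-joined. */
  static String demo() {
    try {
      Path file = Files.createTempFile("gather", ".txt");
      try {
        Files.write(file, "0123456789abcdef".getBytes(StandardCharsets.US_ASCII));
        List<Range> ranges = Arrays.asList(new Range(10, 4), new Range(2, 3));
        try (FileChannel channel = FileChannel.open(file, StandardOpenOption.READ)) {
          readRanges(channel, ranges);
          StringBuilder out = new StringBuilder();
          for (Range r : ranges) {
            // join() returns only once the data is in the buffer
            ByteBuffer buf = r.getData().join();
            if (out.length() > 0) {
              out.append(',');
            }
            out.append(StandardCharsets.US_ASCII.decode(buf));
          }
          return out.toString();
        }
      } finally {
        Files.deleteIfExists(file);
      }
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  public static void main(String[] args) {
    System.out.println(demo()); // prints "abcd,234"
  }
}
```

The caller waits on each future before touching its buffer, and the channel is closed only after all futures have been joined.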

omalley avatar Feb 04 '20 22:02 omalley

@omalley - are you still working on this?

steveloughran avatar Jul 23 '20 13:07 steveloughran

Hey @omalley - thanks for the update. Could you do anything about the unused fields in AsyncBenchmark? They are flooding Yetus:

Unused field:AsyncBenchmark_BufferChoice_jmhType_B3.java

steveloughran avatar Sep 21 '20 10:09 steveloughran

Yeah, I just added a suppression file for findbugs that hopefully will make Yetus happy. Sigh, findbugs and generated code are not a good combination.

omalley avatar Sep 21 '20 15:09 omalley

:broken_heart: -1 overall

| Vote | Subsystem | Runtime | Comment |
|:----:|----------:|--------:|:--------|
| +0 :ok: | reexec | 0m 24s | Docker mode activated. |
| | | | _ Prechecks _ |
| +1 :green_heart: | dupname | 0m 1s | No case conflicting files found. |
| +1 :green_heart: | @author | 0m 0s | The patch does not contain any @author tags. |
| +1 :green_heart: | test4tests | 0m 0s | The patch appears to include 1 new or modified test files. |
| | | | _ trunk Compile Tests _ |
| +0 :ok: | mvndep | 3m 36s | Maven dependency ordering for branch |
| +1 :green_heart: | mvninstall | 28m 40s | trunk passed |
| +1 :green_heart: | compile | 20m 36s | trunk passed with JDK Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1 |
| +1 :green_heart: | compile | 17m 19s | trunk passed with JDK Private Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01 |
| +1 :green_heart: | checkstyle | 2m 48s | trunk passed |
| +1 :green_heart: | mvnsite | 21m 4s | trunk passed |
| +1 :green_heart: | shadedclient | 39m 22s | branch has no errors when building and testing our client artifacts. |
| +1 :green_heart: | javadoc | 6m 26s | trunk passed with JDK Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1 |
| +1 :green_heart: | javadoc | 6m 57s | trunk passed with JDK Private Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01 |
| +0 :ok: | spotbugs | 2m 5s | Used deprecated FindBugs config; considering switching to SpotBugs. |
| +1 :green_heart: | findbugs | 37m 57s | trunk passed |
| | | | _ Patch Compile Tests _ |
| +0 :ok: | mvndep | 0m 33s | Maven dependency ordering for patch |
| +1 :green_heart: | mvninstall | 24m 8s | the patch passed |
| +1 :green_heart: | compile | 20m 3s | the patch passed with JDK Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1 |
| +1 :green_heart: | javac | 20m 3s | the patch passed |
| +1 :green_heart: | compile | 17m 24s | the patch passed with JDK Private Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01 |
| +1 :green_heart: | javac | 17m 24s | the patch passed |
| -0 :warning: | checkstyle | 2m 50s | root: The patch generated 28 new + 90 unchanged - 4 fixed = 118 total (was 94) |
| +1 :green_heart: | mvnsite | 17m 55s | the patch passed |
| +1 :green_heart: | whitespace | 0m 0s | The patch has no whitespace issues. |
| +1 :green_heart: | xml | 0m 7s | The patch has no ill-formed XML file. |
| +1 :green_heart: | shadedclient | 15m 24s | patch has no errors when building and testing our client artifacts. |
| +1 :green_heart: | javadoc | 6m 29s | the patch passed with JDK Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1 |
| +1 :green_heart: | javadoc | 6m 54s | the patch passed with JDK Private Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01 |
| +1 :green_heart: | findbugs | 39m 26s | the patch passed |
| | | | _ Other Tests _ |
| -1 :x: | unit | 577m 2s | root in the patch passed. |
| -1 :x: | asflicense | 1m 50s | The patch generated 1 ASF License warnings. |
| | | 895m 41s | |

| Reason | Tests |
|-------:|:------|
| Failed junit tests | hadoop.yarn.applications.distributedshell.TestDistributedShell |
| | hadoop.yarn.server.resourcemanager.security.TestDelegationTokenRenewer |
| | hadoop.yarn.server.resourcemanager.TestRMRestart |
| | hadoop.yarn.sls.TestReservationSystemInvariants |
| | hadoop.hdfs.TestFileChecksum |
| | hadoop.hdfs.TestSnapshotCommands |
| | hadoop.hdfs.TestFileChecksumCompositeCrc |
| | hadoop.hdfs.TestDFSClientRetries |
| | hadoop.hdfs.TestStripedFileAppend |
| | hadoop.hdfs.server.sps.TestExternalStoragePolicySatisfier |
| | hadoop.crypto.key.kms.server.TestKMS |

| Subsystem | Report/Notes |
|----------:|:-------------|
| Docker | ClientAPI=1.40 ServerAPI=1.40 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-1830/12/artifact/out/Dockerfile |
| GITHUB PR | https://github.com/apache/hadoop/pull/1830 |
| JIRA Issue | HADOOP-11867 |
| Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient xml findbugs checkstyle |
| uname | Linux a64523bb9bef 4.15.0-112-generic #113-Ubuntu SMP Thu Jul 9 23:41:39 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | dev-support/bin/hadoop.sh |
| git revision | trunk / 7fae4133e05 |
| Default Java | Private Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01 |
| Multi-JDK versions | /usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1 /usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01 |
| checkstyle | https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-1830/12/artifact/out/diff-checkstyle-root.txt |
| unit | https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-1830/12/artifact/out/patch-unit-root.txt |
| Test Results | https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-1830/12/testReport/ |
| asflicense | https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-1830/12/artifact/out/patch-asflicense-problems.txt |
| Max. process+thread count | 3731 (vs. ulimit of 5500) |
| modules | C: hadoop-common-project/hadoop-common hadoop-common-project . hadoop-common-project/benchmark U: . |
| Console output | https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-1830/12/console |
| versions | git=2.17.1 maven=3.6.0 findbugs=4.0.6 |
| Powered by | Apache Yetus 0.13.0-SNAPSHOT https://yetus.apache.org |

This message was automatically generated.

hadoop-yetus avatar Sep 23 '20 09:09 hadoop-yetus

I am trying to compile and run the benchmark added.

I am using this command:

java -cp target/hadoop-benchmark-3.4.0-SNAPSHOT-uber.jar org.apache.hadoop.benchmark.AsyncBenchmark /tmp/benchmark

and seeing this failure:

java.lang.NoSuchMethodError: java.nio.ByteBuffer.flip()Ljava/nio/ByteBuffer;
    at org.apache.hadoop.benchmark.AsyncBenchmark$FileRangeCallback.completed(AsyncBenchmark.java:185)
    at org.apache.hadoop.benchmark.AsyncBenchmark$FileRangeCallback.completed(AsyncBenchmark.java:161

Also, when I run the same thing from the IDE with the bundled JRE selected, it works fine.

Is there anything specific I have to do before running the benchmark?

FYI related : https://stackoverflow.com/questions/61267495/exception-in-thread-main-java-lang-nosuchmethoderror-java-nio-bytebuffer-flip

Thanks

mukund-thakur avatar Sep 24 '20 09:09 mukund-thakur

@mukund-thakur: build and test with the same JDK. Java 9+ added overridden methods to ByteBuffer (flip() among them) that return ByteBuffer rather than Buffer. If code has been built against a newer JDK than the one you run it on, you will get link problems like this one.

Warning: some OpenJDK 8 builds (Amazon Corretto) have these methods too, so they cannot be used to build things you intend to run elsewhere.

I recommend you set JAVA_HOME to point to the Java version you want and run the Maven builds on the command line.
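For code that has to be compiled on a newer JDK but still run on Java 8, one common workaround (not something this patch is known to do) is to call `flip()` through the `Buffer` supertype, so the compiled call site refers to `Buffer.flip()`, which exists on every Java version. A minimal sketch, with `FlipCompatSketch` and `flipCompat` as hypothetical names:

```java
import java.nio.Buffer;
import java.nio.ByteBuffer;

public class FlipCompatSketch {

  /**
   * Flips the buffer via the Buffer supertype. The cast makes javac emit a
   * call to Buffer.flip()Ljava/nio/Buffer; instead of the Java 9+ override
   * ByteBuffer.flip()Ljava/nio/ByteBuffer;, so the class links on Java 8 too.
   */
  static ByteBuffer flipCompat(ByteBuffer buffer) {
    ((Buffer) buffer).flip();
    return buffer;
  }

  public static void main(String[] args) {
    ByteBuffer b = ByteBuffer.allocate(16);
    b.put((byte) 1).put((byte) 2).put((byte) 3);
    flipCompat(b);
    // After the flip, position is 0 and limit is the number of bytes written.
    System.out.println(b.position() + " " + b.limit()); // prints "0 3"
  }
}
```

Another option on JDK 9 or later is compiling with `javac --release 8`, which checks the code against the Java 8 API so the newer overrides are never selected.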

steveloughran avatar Sep 24 '20 09:09 steveloughran

Thanks @steveloughran. It works after setting JAVA_HOME explicitly to 1.8.

mukund-thakur avatar Sep 24 '20 14:09 mukund-thakur

I have one question. Why is merging of ranges not done for RawLocalFileSystem but done for ChecksumFileSystem?
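For readers unfamiliar with the term, "merging of ranges" here means coalescing nearby read ranges into fewer, larger reads. The sketch below shows the general idea only; it is not the code from this patch, and the `Range` class, the `merge` method, and the `maxGap` threshold are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

public class RangeMergeSketch {

  /** Hypothetical half-open byte range [offset, offset + length). */
  static final class Range {
    final long offset;
    final long length;

    Range(long offset, long length) {
      this.offset = offset;
      this.length = length;
    }

    long end() {
      return offset + length;
    }
  }

  /**
   * Sorts ranges by offset and merges neighbours whose gap is at most
   * maxGap bytes (overlapping ranges are always merged).
   */
  static List<Range> merge(List<Range> input, long maxGap) {
    List<Range> sorted = new ArrayList<>(input);
    sorted.sort((Range a, Range b) -> Long.compare(a.offset, b.offset));

    List<Range> merged = new ArrayList<>();
    Range current = null;
    for (Range r : sorted) {
      if (current != null && r.offset - current.end() <= maxGap) {
        // Close enough: extend the current merged range to cover r.
        long end = Math.max(current.end(), r.end());
        current = new Range(current.offset, end - current.offset);
      } else {
        if (current != null) {
          merged.add(current);
        }
        current = r;
      }
    }
    if (current != null) {
      merged.add(current);
    }
    return merged;
  }

  public static void main(String[] args) {
    List<Range> ranges = new ArrayList<>();
    ranges.add(new Range(100, 10));
    ranges.add(new Range(0, 10));
    ranges.add(new Range(12, 5));
    for (Range r : merge(ranges, 4)) {
      System.out.println(r.offset + "+" + r.length);
    }
    // prints "0+17" then "100+10"
  }
}
```

The trade-off in any scheme like this is reading a few unwanted bytes in the gap in exchange for fewer separate reads.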

mukund-thakur avatar Sep 24 '20 15:09 mukund-thakur

We're closing this stale PR because it has been open for 100 days with no activity. This isn't a judgement on the merit of the PR in any way. It's just a way of keeping the PR queue manageable. If you feel like this was a mistake, or you would like to continue working on it, please feel free to re-open it and ask for a committer to remove the stale tag and review again. Thanks all for your contribution.

github-actions[bot] avatar Dec 15 '25 00:12 github-actions[bot]