s3committer
Hadoop output committers for S3
```
Caused by: org.apache.hadoop.ipc.RemoteException: No lease on /tmp/application_1500052999350_0147/pending-uploads/_temporary/0/_temporary/attempt_20170722012104_0000_m_000003_3 (inode 84330): File does not exist. Holder DFSClient_NONMAPREDUCE_-1813506953_55 does not have any open files.
    at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkLease(FSNamesystem.java:3432)
    at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.analyzeFileState(FSNamesystem.java:3233)
    at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getNewBlockTargets(FSNamesystem.java:3071)
    at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:3031)
    at ...
```
If a column partition value contains a space, the Paths class throws an exception.
Thanks for this wonderful library. We want to try S3Committer to write Spark output directly to S3. During our testing, we noticed that the job creates output files in...
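For context, a minimal sketch of how one might try wiring the committer into a Spark job. The Spark properties used below (spark.sql.sources.outputCommitterClass, spark.sql.parquet.output.committer.class) are standard Spark settings, but the committer class name and whether this is the setup recommended for this library are assumptions, not taken from the repository's docs.

```
import org.apache.spark.SparkConf;
import org.apache.spark.sql.SparkSession;

public class S3CommitterSparkSketch {
    public static void main(String[] args) {
        // NOTE: the committer class below is assumed from this project's package
        // (com.netflix.bdp.s3); check the README for the exact recommended setup.
        SparkConf conf = new SparkConf()
                .setAppName("s3committer-sketch")
                // Committer hook used by non-Parquet data sources.
                .set("spark.sql.sources.outputCommitterClass",
                     "com.netflix.bdp.s3.S3DirectoryOutputCommitter")
                // Committer hook used by the Parquet data source.
                .set("spark.sql.parquet.output.committer.class",
                     "com.netflix.bdp.s3.S3DirectoryOutputCommitter");

        SparkSession spark = SparkSession.builder().config(conf).getOrCreate();
        // Hypothetical bucket/path, for illustration only.
        spark.range(1000).write().parquet("s3://my-bucket/tmp/s3committer-sketch/");
        spark.stop();
    }
}
```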
Just wondering if my interpretation of some relevant docs and the code here is correct:
- it seems like uploading a single large file in parts is possible / supported...
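To make that question concrete, here is a hedged sketch of uploading a single large file in parts using the AWS SDK for Java v1 multipart API, which is the S3 mechanism these committers appear to build on. The bucket, key, and file path are placeholders, and this illustrates the S3 API itself rather than this library's internal code.

```
import com.amazonaws.services.s3.AmazonS3;
import com.amazonaws.services.s3.AmazonS3ClientBuilder;
import com.amazonaws.services.s3.model.*;
import java.io.File;
import java.util.ArrayList;
import java.util.List;

public class MultipartUploadSketch {
    public static void main(String[] args) {
        // Placeholder bucket, key, and local file for illustration only.
        String bucket = "my-bucket";
        String key = "output/large-file.bin";
        File file = new File("/tmp/large-file.bin");

        AmazonS3 s3 = AmazonS3ClientBuilder.defaultClient();

        // 1. Start the multipart upload and keep its upload id.
        InitiateMultipartUploadResult init =
                s3.initiateMultipartUpload(new InitiateMultipartUploadRequest(bucket, key));

        // 2. Upload fixed-size parts (5 MB minimum for all but the last part).
        long partSize = 5L * 1024 * 1024;
        List<PartETag> etags = new ArrayList<>();
        long offset = 0;
        for (int partNumber = 1; offset < file.length(); partNumber++) {
            long size = Math.min(partSize, file.length() - offset);
            UploadPartRequest req = new UploadPartRequest()
                    .withBucketName(bucket)
                    .withKey(key)
                    .withUploadId(init.getUploadId())
                    .withPartNumber(partNumber)
                    .withFile(file)
                    .withFileOffset(offset)
                    .withPartSize(size);
            etags.add(s3.uploadPart(req).getPartETag());
            offset += size;
        }

        // 3. Completing the upload is what makes the object visible; until then
        //    the uploaded parts simply sit pending on S3.
        s3.completeMultipartUpload(
                new CompleteMultipartUploadRequest(bucket, key, init.getUploadId(), etags));
    }
}
```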
Running a Gradle build will crash since it's missing a dependency. Updating the build.gradle with
```
buildscript {
  repositories {
    jcenter()
    maven { url "https://plugins.gradle.org/m2/" }
  }
  dependencies {
    classpath ...
```
Assert.assertEquals("File sets should match", expectedFiles, actualFiles); fails, even though expectedFiles has the same order as actualFiles:
```
java.lang.AssertionError: File sets should match expected: but was:
```
Hi, I'm using this committer in Spark. When I run one job, everything is perfect. When I start multiple jobs (from different threads), I get this error quite often (not always)...
Assert.assertTrue("MR job should succeed", mrJob.waitForCompletion(true)) fails because waitForCompletion(true) returns false:
```
java.lang.AssertionError: MR job should succeed
    at org.junit.Assert.fail(Assert.java:88)
    at org.junit.Assert.assertTrue(Assert.java:41)
    at com.netflix.bdp.s3.TestMRJob.testMRJob(TestMRJob.java:141)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at ...
```
I hated the slow nature of Hive on S3 on EMR, so I wanted to use this committer. Unfortunately, I am hitting the issue shown below. Any idea...