hadoop-connectors icon indicating copy to clipboard operation
hadoop-connectors copied to clipboard

The connector doesn't provide atomic file creations if a thread is interrupted

Open mahammadv opened this issue 4 years ago • 2 comments

We have encountered a scenario where the close method is not called on output streams of GCS connector, however the file itself is being flushed, so it becomes visible. The issue happens when the thread that does writes is interrupted. Apparently, the thread interruption triggers close method in the output stream. The expected behavior here is that the file will never be visible in case of any thread interruption. We expect this behavior to be fixed in GCS, so the atomic file creation is guaranteed by the output stream close. Please find the repro and stacktrace below.

Example code to reproduce the issue:

Path path = new Path("gs://...");
FileSystem fs = path.getFileSystem(conf);
OutputStream out = fs.create(path);
Thread.currentThread().interrupt();
try {
   out.write("foo".getBytes());
   out.close();
} catch(IOException e) {
   Thread.interrupted();
   e.printStackTrace();
   Thread.sleep(1000); // Wait for the other thread to complete the request
   System.out.println(fs.exists(path)); // will output `true`
   System.out.println("content: " + IOUtils.toString(fs.open(path))); // The content is empty
}

Stacktrace of the issue:

  at java.io.PipedOutputStream.close(PipedOutputStream.java:175)
	  at java.nio.channels.Channels$WritableByteChannelImpl.implCloseChannel(Channels.java:469)
	  at java.nio.channels.spi.AbstractInterruptibleChannel$1.interrupt(AbstractInterruptibleChannel.java:165)
	  - locked <0xc2f> (a java.lang.Object)
	  at java.nio.channels.spi.AbstractInterruptibleChannel.begin(AbstractInterruptibleChannel.java:173)
	  at java.nio.channels.Channels$WritableByteChannelImpl.write(Channels.java:457)
	  - locked <0xc3b> (a java.lang.Object)
	  at com.google.cloud.hadoop.util.BaseAbstractGoogleAsyncWriteChannel.write(BaseAbstractGoogleAsyncWriteChannel.java:136)
	  - locked <0xc3c> (a com.google.cloud.hadoop.gcsio.GoogleCloudStorageImpl$2)
	  at java.nio.channels.Channels.writeFullyImpl(Channels.java:78)
	  at java.nio.channels.Channels.writeFully(Channels.java:101)
	  at java.nio.channels.Channels.access$000(Channels.java:61)
	  at java.nio.channels.Channels$1.write(Channels.java:174)

mahammadv avatar Sep 10 '21 07:09 mahammadv

@medb Please check this out, thanks!

mahammadv avatar Sep 10 '21 07:09 mahammadv

this is consistent with hadoop FS spec, which says when create(path) returns, exists(path) must hold https://hadoop.apache.org/docs/stable/hadoop-project-dist/hadoop-common/filesystem/filesystem.html#FSDataOutputStream_create.28Path.2C_....29

the fact that s3 doesn't do this is a divergence there.

for atomicity, write to a different location and rename the file. that is atomic in GCS (but not s3 though the s3a connector)

steveloughran avatar Jan 21 '22 15:01 steveloughran