s3fs-nio
s3fs-nio trying to put back an object downloaded from the bucket
Bug Description
s3fs-nio tries to put back (re-upload) an object that was only downloaded from the bucket, as per the stack trace:
at org.carlspring.cloud.storage.s3fs.S3SeekableByteChannel.sync(S3SeekableByteChannel.java:182)
at org.carlspring.cloud.storage.s3fs.S3SeekableByteChannel.close(S3SeekableByteChannel.java:146)
at java.base/sun.nio.ch.ChannelInputStream.close(ChannelInputStream.java:279)
Steps To Reproduce
// 'file' holds the s3:// URI of the object to read
FileSystem fs = FileSystems.newFileSystem(URI.create(file), new HashMap<>());
Path p = fs.getPath(new URI(file).getPath());
Files.readAllBytes(p); // closing the underlying channel triggers the unwanted upload
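A more self-contained sketch of the reproduction, assuming file is the object's s3:// URI (the endpoint, bucket and key below are placeholders, credentials are assumed to come from the environment, and the class name is made up):

import java.net.URI;
import java.nio.file.FileSystem;
import java.nio.file.FileSystems;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.HashMap;

public class Reproduce
{
    public static void main(String[] args)
        throws Exception
    {
        // Placeholder object URI; replace endpoint, bucket and key with real values.
        String file = "s3://s3.amazonaws.com/some-bucket/some-object";

        FileSystem fs = FileSystems.newFileSystem(URI.create(file), new HashMap<>());
        Path p = fs.getPath(new URI(file).getPath());

        // Reading is enough: the unwanted upload is reported to happen when the
        // channel backing readAllBytes() is closed.
        byte[] content = Files.readAllBytes(p);
        System.out.println(content.length + " bytes read");
    }
}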
Expected Behavior
The object is downloaded without its content being synced back to the bucket.
Environment
- s3fs-nio version: 1.0.0
- OS: not relevant
- JDK: not relevant
@stefanofornari ,
Thanks for reporting and for providing sample code to reproduce it!
How large is the file?
Also, just last night we released version 1.0.1. Would you be able to try it with that version as well?
If you could knock up a simple test case as well, that would help us figure things out quicker.
Size of the file: 262149 bytes
It happens with 1.0.1 as well. It has to do with a temporary file in S3SeekableByteChannel; here is its close() method:
public void close()
    throws IOException
{
    try
    {
        if (!seekable.isOpen())
        {
            return;
        }

        seekable.close();

        if (options.contains(StandardOpenOption.DELETE_ON_CLOSE))
        {
            path.getFileSystem().provider().delete(path);
            return;
        }

        // Channels opened with exactly { READ } are never synced back.
        if (options.contains(StandardOpenOption.READ) && options.size() == 1)
        {
            return;
        }

        // Otherwise the locally buffered temporary file is uploaded back to the bucket.
        if (this.tempFile != null)
        {
            sync();
        }
    }
    finally
    {
        if (tempFile != null)
        {
            Files.deleteIfExists(tempFile);
        }
    }
}
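A minimal sketch of why that read-only guard may not fire here, assuming the option set reaches S3SeekableByteChannel unchanged: Files.readAllBytes(p) opens its channel via Files.newByteChannel(p) with no open options, so the channel would see an empty option set rather than { READ }, and close() would fall through to sync(). The class below only demonstrates that evaluation; its name is made up:

import java.nio.file.StandardOpenOption;
import java.util.EnumSet;
import java.util.Set;

public class ReadOnlyGuardSketch
{
    public static void main(String[] args)
    {
        // Files.readAllBytes(p) calls Files.newByteChannel(p) with no open options,
        // so (under the assumption above) S3SeekableByteChannel sees an empty set.
        Set<StandardOpenOption> options = EnumSet.noneOf(StandardOpenOption.class);

        // The guard from close() above:
        boolean skippedAsReadOnly = options.contains(StandardOpenOption.READ)
                                    && options.size() == 1;

        // Prints "false": the guard does not match, so sync() would be reached.
        System.out.println(skippedAsReadOnly);
    }
}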
Hey @stefanofornari,
Thanks for reporting this. Are you using S3 or StorJ?
Hey, I am using StorJ, which I think uses MinIO
I'm not entirely convinced this is a bug in s3fs-nio, because we have a test case that covers this exact behavior:
https://github.com/carlspring/s3fs-nio/blob/1f33c9d3da564ddebbe305a9a86ebeb270e34e31/src/test/java/org/carlspring/cloud/storage/s3fs/FilesIT.java#L762-L847
And it's working fine with S3. Maybe it's something specific to StorJ?
If you could create a reproducible test case with S3 that would be helpful.
Hi Steve, I am not sure the code above represents the same use case. It actually writes a file, while in the code I have provided, I read a file. Still, I see that sync() call that pushes the file I have read back to the server. I can provide the full log if needed.
PS: generally speaking, if you want s3fs-nio to be usable with any S3-compliant service, even service-specific behaviour should be handled by the library itself (but as I mentioned, StorJ uses MinIO). Additionally, as I pointed out in https://github.com/carlspring/s3fs-nio/issues/715#issuecomment-1569844178, this behaviour comes from library code; StorJ may trigger a corner case, but I would not be so quick to claim it is not a bug.
Maybe you've missed this part of the test:
String first = "first-write";
Files.write(s3file, first.getBytes());
assertThat(Files.readAllBytes(s3file)).isEqualTo(first.getBytes());
^^^^
In essence the test simulates creating a file at S3, checks that the upstream file contains the expected bytes and then replaces it a few times.
We aim to be AWS S3 compliant. However, this does not mean "any-cloud-solution-claiming-to-be-S3-compliant". MinIO also has its own deviations in its implementation. For the better part it's compliant, but it's not a 100% one-to-one comparison. :)
If you could try it out with S3 and maybe provide a test case in a repo that reproduces the problem -- that would be much appreciated.
Ciao, @stefanofornari ! :)
StorJ may trigger a corner case, but I would not be so quick in claiming it is not a bug.
We are in no rush to dismiss this issue. Ideally, we would like s3fs-nio to be able to handle different implementations of S3. However, it's currently based on the AWS Java library, so if something is not supported out of the box by it, somebody would have to invest some time into investigating what the differences between MinIO, StorJ and/or other implementations are, and then provide a fix, or at least clearly outline what needs to be done. This is why we're trying to understand the problem.
Contributions are very welcome, as we are a small team.
Maybe you've missed this part of the test:
String first = "first-write";
Files.write(s3file, first.getBytes());
assertThat(Files.readAllBytes(s3file)).isEqualTo(first.getBytes());
^^^^
I think I did not make myself understood, sorry. All tests pass with StorJ. What I mean is that after a file is read with Files.readAllBytes(p) (and the download works well), the same blob is uploaded back to the bucket when the channel is closed. This upload is highly undesirable because it consumes bandwidth, it takes time and, most importantly, it touches the file on the bucket. It happens because, as explained in https://github.com/carlspring/s3fs-nio/issues/715#issuecomment-1569844178, sync() is called when the channel is closed, which causes the file to be uploaded again. This looks to me like a bug in the library. To see it, you need to raise the log level of the AWS SDK. Let me know if you want the logs of the transaction, but it should be fairly easy to reproduce with the code in the original description. Let me know if I can be of any help.
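A possible workaround, sketched from the close() method quoted earlier and assuming the option set is passed through to S3SeekableByteChannel unchanged: opening the channel with an explicit option set of exactly { READ } should make the read-only guard fire and skip sync(). This is untested, and the class and method names are made up for illustration:

import java.io.IOException;
import java.io.InputStream;
import java.nio.channels.Channels;
import java.nio.channels.SeekableByteChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

public class ReadOnlyWorkaroundSketch
{
    // Possible workaround (untested): read through a channel opened with exactly
    // { READ }, so that close() returns before reaching sync().
    public static byte[] readWithoutSync(Path p)
        throws IOException
    {
        try (SeekableByteChannel ch = Files.newByteChannel(p, StandardOpenOption.READ);
             InputStream in = Channels.newInputStream(ch))
        {
            return in.readAllBytes();
        }
    }
}

Whether this actually avoids the extra upload against StorJ would still need to be verified against the AWS SDK request logs mentioned above.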
Thanks for further elaborating! I don't have the time to look into this right now, but it does sound like it needs to be checked out.
@stefanofornari ,
Would you be interested in further investigating this and attempting to fix it?