commons-vfs AbstractFileObject: call doGetType() without holding the lock

AbstractFileObject: call doGetType() without holding the lock

Open MaxKellermann opened this issue 7 years ago • 18 comments

To reduce lock contention. doGetType() does not need to be protected, but does synchronous I/O and can run for a long time (if the NFS server is slow to respond). This would block all other calls to the same FileSystem instance.

Jun 20 '17 18:06 MaxKellermann

The Travis-CI failure looks unrelated to this PR:

rake
rake aborted!
No Rakefile found (looking for: rakefile, Rakefile, rakefile.rb, Rakefile.rb)

Jun 20 '17 19:06 MaxKellermann

Why was this closed?

May 02 '19 06:05 MaxKellermann

The PR was against trunk The trunk branch was deleted as all Commons projects use maser in preference to trunk. That automatically closed all PRs against trunk. I'll re-open this but you'll need to modify the PR to be against master rather than trunk.

May 02 '19 09:05 markt-asf

Thanks, I've rebased on master.

May 02 '19 09:05 MaxKellermann

Hm, I think this would be a good improvement, but I am not sure if all providers (especialyl external ones) are ready to deal with concurrent invocation of the do method. A more compatible change would be some sort of mechanism where only providers willing to opt in for this can use it?

May 17 '19 13:05 ecki

I am not sure if all providers (especialyl external ones) are ready to deal with concurrent invocation of the do method.

That would be a bug in those providers, wouldn't it? Because nowhere can I find a guarantee that there's a lock. It just happens to be here now, as an implementation detail.

A more compatible change would be some sort of mechanism where only providers willing to opt in for this can use it?

That would make the code more complex, add code duplication, which is error prone. And wouldn't lead anywhere, wouldn't get those (potential) bugs in providers fixed.

Jan 07 '21 11:01 MaxKellermann

I don’t think there was ever a clear specification on locking/concurrency, so must provides would have checked what the code does. But having said that, I don’t think that area was ever robust, so introducing yet another failure mode might not be a problem.

Did you checked the existing providers if they are fine with it?

Jan 07 '21 11:01 ecki

Most are - but look at the confusing commit which added synchronized to the FTP provider: cd9e19f8eb6de22a3a545064ab0e68254bd243a2 Without requiring providers to be thread-safe, this commit doesn't make sense.

Few providers are not thread-safe, like SFTP with a TOCTOU bug. This should be fixed by adding a narrow synchronized, i.e. not on the whole filesystem, only on that one file, to avoid blocking other threads working with other files. HDFS is not thread-safe, but is buggy in other ways; its doAttach() method gets called again and again, without checking if that has been done before already. The worst that can happen with SFTP and HDFS (the two bad plugins) unnecessary duplication of calls.

Jan 07 '21 11:01 MaxKellermann

The big issue with kind of PR is that it undoes a previous intentional fix which was not unit tested. This PR does not have a unit test either so we are just going back and forth between different behaviors.

Ideally we need to start with some docs on what the expectations should be on on the contracts provided by not only the interfaces but by classes like this abstract class which is reused all over.

IOW, I would not change anything until we establish or relearn what the ground rules and expectations are WRT thread-safety as it seems reasonable for this abstract class to enforce or help establish thread safety or the lack of it...

@rgoers any thoughts?

Jan 07 '21 14:01 garydgregory

I added commits for fixing the obvious TOCTOU bugs in SFTP and HDFS. The other provides look like they don't need to be fixed.

Jan 12 '21 19:01 MaxKellermann

Unit test failures because the HdfsFileObject class is a big fuckup ... it calls the internal method doAttach(), circumventing attach(), i.e. the base class AbstractFileObject can't keep the attached flag up to date and thus never calls doDetach().

Jan 12 '21 19:01 MaxKellermann

BTW thanks for using a branch, lets give us some time to discuss this, its painful to remebr all those undefined corners (again and again).

Jan 12 '21 20:01 ecki

BTW I always try to use a VFS manager single threaded, I am not completely sure but most of the code looks like this was the intended use anyway :)

Jan 12 '21 20:01 ecki

BTW I always try to use a VFS manager single threaded, I am not completely sure but most of the code looks like this was the intended use anyway :)

My employer has a rather large Tomcat application which uses commons-vfs, heavily multi-threaded of course. Not my department (I don't do Java usually), but I sometimes helped with debugging, and commons-vfs was often a big problem. It pretends to be thread-safe, but sometimes locks too much (addressed by this PR initially), and sometimes locks too little (now addressed by this PR as well). Everything is wrong with commons-vfs's thread-safety, and I'm trying to help fix it.

Jan 12 '21 20:01 MaxKellermann

Phew, after many iterations, I think I have found a solution for HdfsFileObject. That class was confusing and broken, and I think it's better now - faster and thread-safe.

Jan 12 '21 21:01 MaxKellermann

I am trying to digest this into manageable chunks and I see a standalone edit to SftpFileObject that adds synchronized to three method declarations. Those three methods deal with the mutable and thread-unsafe com.jcraft.jsch.SftpATTRS. Why just those three methods? I do not know but it seems that all access to this SftpATTRS should be synchronized, so I added synchronized to the methods that read or write the SftpATTRS instance.

Jul 15 '21 20:07 garydgregory

Ping?

Apr 20 '22 23:04 garydgregory

Pong?

Apr 21 '22 03:04 MaxKellermann

commons-vfs commons-vfs copied to clipboard

AbstractFileObject: call doGetType() without holding the lock

commons-vfs
commons-vfs copied to clipboard