sshfs
Question regarding call of stat() for parent dir
Hello, while using sshfs in fsspec.open_files(), I discovered that stat() is called for the parent directory of the requested files, even if it is already clear that this must be a directory. While this is most certainly not an issue in most cases, the sftp server I have to use behaves somewhat strangely in this regard: I get a permission error when trying to call stat() on these directories.
When using the default sftp implementation from fsspec there is no issue at all, so at least to me it seems that it should be possible without a call to stat(). Is there any way to achieve this with this library as well? I would really like to use it for performance reasons compared to sftp. Thank you!
Hi @Bizarious. Sounds like a bug, maybe you could pinpoint the specific line in the code? If you are getting a permission error, I suppose you have a traceback for that lying around as well?
Sorry for the late reply, there were some external circumstances that prevented me from responding.
First of all, thank you for the answer! I think a bit more context would be helpful as well:
I'm using fsspec.open_files() with a URL that looks like this one:
ssh://user:password@sftp_host/root/path/*.zip
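For reference, the pieces of such a URL, including the parent directory of the glob pattern, can be pulled apart with the standard library. This is only an illustrative sketch using urllib.parse and posixpath; it is not how fsspec itself parses URLs, and the variable names are made up for the example.

```python
import posixpath
from urllib.parse import urlsplit

# The example URL from this thread (placeholder credentials and host).
url = "ssh://user:password@sftp_host/root/path/*.zip"
parts = urlsplit(url)

protocol = parts.scheme                   # selects the filesystem implementation
host = parts.hostname                     # the SFTP server
pattern = parts.path                      # the glob pattern on the remote side
parent_dir = posixpath.dirname(pattern)   # the directory that ends up being stat()ed

print(protocol, host, pattern, parent_dir)
```

The value of parent_dir is the path that the rest of this report is about.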
Now it seems the filesystem calls stat() on the directory path (considering the example above), even though this should not be necessary. The relevant part of the trace looks like this:
  File ".../lib/python3.10/site-packages/fsspec_sync/sync.py", line 128, in fsspec_sync
    source_open_files: OpenFiles = fsspec.open_files(
  File ".../lib/python3.10/site-packages/fsspec/core.py", line 282, in open_files
    fs, fs_token, paths = get_fs_token_paths(
  File ".../lib/python3.10/site-packages/fsspec/core.py", line 641, in get_fs_token_paths
    paths = [f for f in sorted(fs.glob(paths)) if not fs.isdir(f)]
  File ".../lib/python3.10/site-packages/fsspec/asyn.py", line 118, in wrapper
    return sync(self.loop, func, *args, **kwargs)
  File ".../lib/python3.10/site-packages/fsspec/asyn.py", line 103, in sync
    raise return_result
  File ".../lib/python3.10/site-packages/fsspec/asyn.py", line 56, in _runner
    result[0] = await coro
  File ".../lib/python3.10/site-packages/fsspec/asyn.py", line 775, in _glob
    allpaths = await self._find(
  File ".../lib/python3.10/site-packages/fsspec/asyn.py", line 841, in _find
    if withdirs and path != "" and await self._isdir(path):
  File ".../lib/python3.10/site-packages/fsspec/asyn.py", line 652, in _isdir
    return (await self._info(path))["type"] == "directory"
  File ".../lib/python3.10/site-packages/sshfs/utils.py", line 27, in wrapper
    return await func(*args, **kwargs)
  File ".../lib/python3.10/site-packages/sshfs/spec.py", line 141, in _info
    attributes = await channel.stat(path)
  File ".../lib/python3.10/site-packages/asyncssh/sftp.py", line 4573, in stat
    return await self._handler.stat(path, flags)
  File ".../lib/python3.10/site-packages/asyncssh/sftp.py", line 2695, in stat
    return cast(SFTPAttrs, await self._make_request(
  File ".../lib/python3.10/site-packages/asyncssh/sftp.py", line 2454, in _make_request
    result = self._packet_handlers[resptype](self, resp)
  File ".../lib/python3.10/site-packages/asyncssh/sftp.py", line 2470, in _process_status
    raise exc
asyncssh.sftp.SFTPPermissionDenied: Permission denied.
What I discovered using the debugger was that fsspec splits the path in the _glob() function and calls _find() on the directory, so /root/path/ in our case. _find() then calls _isdir() on that path, which in turn calls _info() of the ssh filesystem. That leads to a call of stat() on this directory, resulting in a permission error in my case. The relevant line in sshfs would be 141 in sshfs/spec.py.
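The call chain described above can be modeled with a small stand-in class. This is a simplified sketch, not fsspec or sshfs code: FakeRemoteFS, its listing and denied arguments, and all method bodies are hypothetical and only mirror the order of calls visible in the traceback (_glob, _find, _isdir, _info, stat). A plain PermissionError stands in for the server's "Permission denied" response.

```python
import asyncio
import fnmatch


class FakeRemoteFS:
    """Hypothetical stand-in for a filesystem whose server may refuse stat()."""

    def __init__(self, listing, denied):
        self.listing = listing      # directory -> list of file names in it
        self.denied = set(denied)   # paths on which stat() is refused
        self.stat_calls = []        # record of every stat() request

    async def _stat(self, path):
        # Stand-in for the SFTP stat request.
        self.stat_calls.append(path)
        if path in self.denied:
            raise PermissionError(f"Permission denied: {path}")
        kind = "directory" if path in self.listing else "file"
        return {"name": path, "type": kind}

    async def _info(self, path):
        return await self._stat(path)

    async def _isdir(self, path):
        return (await self._info(path))["type"] == "directory"

    async def _find(self, path, withdirs=False):
        # Same condition as in the traceback: the search root itself is checked.
        found = []
        if withdirs and path != "" and await self._isdir(path):
            found.append(path)
        found += [f"{path}/{name}" for name in self.listing.get(path, [])]
        return found

    async def _glob(self, pattern):
        # Split off the directory part and search it, then filter by pattern.
        root, _, _ = pattern.rpartition("/")
        candidates = await self._find(root, withdirs=True)
        return sorted(p for p in candidates if fnmatch.fnmatch(p, pattern))


def run_glob(fs, pattern):
    """Run the async glob from synchronous code."""
    return asyncio.run(fs._glob(pattern))
```

With denied=["/root/path"], run_glob(fs, "/root/path/*.zip") raises before any file is listed, even though listing the directory would have been enough to answer the glob in this model.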
Of course we are talking about the async implementation of glob() and find(), but I compared it to the normal ones and they look mostly similar, especially the call to isdir().
I am not sure if there is anything that can be done inside the ssh implementation, but as I already mentioned, the default sftp implementation does not have this problem. Please let me know what you think and whether I missed anything! Thanks :)