Recursive download seems broken
When running fs.download(rpath, lpath, recursive=True)
I now get the following error:
IsADirectoryError: [Errno 21] Is a directory:
Is lpath a directory, and does it exist? Could you please provide more details, perhaps as a test case?
Hi Martin, so I have the same issue as Nicolas. Our code was working on gcsfs 0.6.0 and it fails on 0.6.1, with Python 3.6 (for me).
fs = GCSFileSystem(project=PROJECT)
fs.download(GCS_URI, f'/tmp/{dataset_name}/', recursive=True)
where GCS_URI is a directory on a GCS bucket and the lpath is indeed a directory on the local disk.
You can find where the error appears in the following trace:
File "/home/xxx/.local/share/virtualenvs/private-learning-lab-iw_HFsYy/lib/python3.6/site-packages/fsspec/spec.py", line 977, in download
return self.get(rpath, lpath, recursive=recursive, **kwargs)
File "/home/xxx/.local/share/virtualenvs/private-learning-lab-iw_HFsYy/lib/python3.6/site-packages/fsspec/spec.py", line 610, in get
with open(lpath, "wb") as f2:
So basically, fsspec tries to open a file which is actually a dir. I don't know why, but it didn't happen with gcsfs 0.6.0.
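For anyone trying to reproduce this without a bucket, here is a minimal local sketch of the failing step in the trace above: calling open(lpath, "wb") on a path that is already a directory. It only uses the standard library; the helper name try_open_for_write and the "dataset" directory are made up for illustration and are not part of fsspec or gcsfs.

```python
import os
import tempfile


def try_open_for_write(path):
    """Return the name of the exception raised by open(path, "wb"), or None."""
    try:
        with open(path, "wb"):
            pass
    except OSError as exc:
        return type(exc).__name__
    return None


with tempfile.TemporaryDirectory() as tmp:
    # Hypothetical local target that already exists as a directory,
    # standing in for the lpath that fsspec tries to open as a file.
    local_dir = os.path.join(tmp, "dataset")
    os.makedirs(local_dir)
    result = try_open_for_write(local_dir)
    print(result)
```

On Linux and macOS this should report IsADirectoryError, matching the Errno 21 in the report; other platforms may raise a different OSError subclass.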
Does that "dir" correspond to a real, existing key in the bucket? We could possibly add an exception in fsspec (or here, although the same is true for s3fs), so that zero-length files are not written, or explicitly prune out paths that look like they contain deeper nested things.
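To make the pruning idea concrete, here is a rough sketch of what "prune out paths that look like they contain deeper nested things" could mean. This is not the fsspec or gcsfs implementation; prune_directory_like_keys and the example listing are hypothetical, and it assumes the recursive listing is available as a plain list of path strings.

```python
def prune_directory_like_keys(paths):
    """Drop any path that is also a prefix ("directory") of another path."""
    # Normalise trailing slashes so "bucket/data" and "bucket/data/" compare equal.
    cleaned = [p.rstrip("/") for p in paths]
    keep = []
    for original, path in zip(paths, cleaned):
        prefix = path + "/"
        # A path with something nested under it cannot be written as a local file.
        has_children = any(other.startswith(prefix) for other in cleaned)
        if not has_children:
            keep.append(original)
    return keep


# Hypothetical listing: "bucket/data" exists both as a key and as a prefix.
listing = [
    "bucket/data",
    "bucket/data/",
    "bucket/data/a.csv",
    "bucket/data/sub",
    "bucket/data/sub/b.csv",
    "bucket/readme.txt",
]
files_only = prune_directory_like_keys(listing)
print(files_only)
```

The scan is quadratic in the number of paths, which is fine for a sketch but would want a sorted or set-based approach for large listings. It also keeps an empty placeholder key such as "bucket/empty/" that has nothing nested under it, so a real fix would probably need to handle trailing-slash keys separately.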
Yes, the URI points to an existing key in the bucket. Our use case is very generic, and the issue should arise for any case with recursive=True. The weird thing is that it was perfectly fine on 0.6.0. I may spend some time tomorrow to understand which change produced that behavior.
If there is a key that has the same path as a directory, then this problem should be expected - but of course it would be better if it was worked around. I suspect in the previous version, the key-with-the-name-of-a-directory simply wasn't being returned by gcsfs, which is also wrong.