s3contents
glob pattern in filename is unexpectedly expanded
Because s3fs (fsspec) supports some glob patterns (`*`, `**`, `?`, and some of `[]`), there are file names that should not be passed to `self.fs.cp` or `self.fs.rm` without escaping.
To reproduce
Example 1.
- Add two files, `test_file.txt` and `test?file.txt`.
- A request to remove `test?file.txt` removes both files.
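To illustrate why both files are affected, here is a minimal sketch that uses the standard library's `fnmatch` as a stand-in for glob matching. It is not fsspec's own matcher, and the `existing` list is a hypothetical bucket listing, but the `?` wildcard behaves the same way in this case:

```python
from fnmatch import fnmatchcase

# Hypothetical listing of the two files from Example 1.
existing = ["test_file.txt", "test?file.txt"]

# The name the user asked to remove, interpreted as a glob pattern.
requested = "test?file.txt"

# "?" matches any single character, so it matches both "_" and a literal "?".
matched = [name for name in existing if fnmatchcase(name, requested)]
print(matched)
```

Both names match, which is consistent with the removal request deleting both files.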
Example 2.
- Create a directory `test?dir`.
- Removing (or renaming) the directory `test?dir` fails with `RecursionError: maximum recursion depth exceeded while calling a Python object`.
To fix the issue
For now, adding `glob.escape` doesn't fix the issue because `fsspec` does not interpret the pattern `[?]` correctly.
https://github.com/fsspec/filesystem_spec/blob/2023.5.0/fsspec/asyn.py#L648
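For reference, this is what `glob.escape` produces and how the standard library's `fnmatch` interprets the result. This only shows the stdlib behaviour that the escape relies on; it does not call fsspec, whose handling of `[?]` in the linked version is the problem described above:

```python
import glob
from fnmatch import fnmatchcase

# glob.escape wraps each glob metacharacter in a one-character class.
escaped = glob.escape("test?file.txt")
print(escaped)

# With a matcher that honours "[?]", only the literal "?" name matches.
matches_literal = fnmatchcase("test?file.txt", escaped)
matches_other = fnmatchcase("test_file.txt", escaped)
print(matches_literal, matches_other)
```

A matcher that treats `[?]` as a literal question mark would make escaping sufficient; the workaround is only needed because that does not hold here.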
Instead, I configured the fs not to use glob by overriding `_expand_path` in a subclass

    class S3FileSystemNoGlob(s3fs.S3FileSystem):
        async def _expand_path(self, path, recursive=False, maxdepth=None):
            ...

and turning off the if-branch at https://github.com/fsspec/filesystem_spec/blob/2023.5.0/fsspec/asyn.py#L763-L772.
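The idea behind the workaround can be sketched with a small, self-contained model. This is not the fsspec code: `expand_path`, `contains_glob_chars`, and the flat `keys` listing are hypothetical names for illustration, and the sketch assumes the disabled branch is the one that treats a path containing glob characters as a pattern:

```python
from fnmatch import fnmatchcase

GLOB_CHARS = set("*?[")


def contains_glob_chars(path):
    """Return True if the path contains a glob metacharacter (hypothetical helper)."""
    return any(ch in GLOB_CHARS for ch in path)


def expand_path(path, keys, use_glob=True):
    """Simplified model of path expansion over a flat listing of keys."""
    if use_glob and contains_glob_chars(path):
        # Pattern branch: the path is expanded against every known key.
        return sorted(k for k in keys if fnmatchcase(k, path))
    # Literal branch: the path names exactly one key, if it exists.
    return [path] if path in keys else []


keys = ["test_file.txt", "test?file.txt"]
with_glob = expand_path("test?file.txt", keys, use_glob=True)
without_glob = expand_path("test?file.txt", keys, use_glob=False)
print(with_glob)
print(without_glob)
```

With the pattern branch disabled, the name is only ever treated literally, so a request for `test?file.txt` touches that one file. The trade-off is that callers lose glob expansion entirely.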