filesystem_spec
Use of '::' in a file name
I am investigating this issue in uproot, where we try to open a file whose name contains ::. The expected behavior would be to create the file named in the string, but that is not what happens. The reason is in this function:
https://github.com/fsspec/filesystem_spec/blob/fe59f48363029a0da68a0de534bf63e1c42e5a81/fsspec/core.py#L331-L366
where, it seems to me, the case of :: being part of the file name is not considered, and it is only treated as a protocol separator.
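To make the splitting concrete, here is a minimal sketch of the behavior as I understand it. It is a simplified stand-in, not the fsspec function itself: naive_split is a hypothetical name, and the regex is the same pattern used in the proposal in this issue.

```python
import re

# Matches any string containing a character outside a-z,
# i.e. anything that does not look like a bare protocol name.
_NON_PROTOCOL = re.compile(".*[^a-z]+.*")


def naive_split(path):
    """Split on '::' and turn bare lowercase words into 'word://'."""
    if "::" not in path:
        return [path]
    return [
        p if "://" in p or _NON_PROTOCOL.match(p) else p + "://"
        for p in path.split("::")
    ]


# A genuine chained URL splits into a protocol and a target.
print(naive_split("zip::file://a.zip"))    # ['zip://', 'file://a.zip']
# A file name containing '::' is cut in two, which is the problem here.
print(naive_split("dir/file::name.root"))  # ['dir/file', 'name.root']
```

The second call shows why the file is never created under its intended name: every :: is treated as a separator, whatever surrounds it.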
One idea to adapt the code would be the following:
if "::" in path:
    x = re.compile(".*[^a-z]+.*")  # test for non protocol-like single word
    bits = []
    for p in path.split("::"):
        # Check if part looks like a protocol or URL
        if "://" in p or x.match(p) or p in known_implementations:
            bits.append(p)
        else:
            # If not, assume it is part of the file name
            bits.append(p + "://")
    # If no part matches a known protocol, treat the entire path as a file name
    if not any(b for b in bits if b.strip("://") in known_implementations):
        bits = [path]
else:
    bits = [path]
This fixes Jim's reproducer, but breaks a few tests, making me wonder if this behavior is intentional.
The question is thus the following: should logic be added to fsspec to handle the case in which :: is part of the file name, or should we implement a check in uproot that raises an error if :: is not used as a protocol separator?
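For the second option, a check on the caller's side could look roughly like the sketch below. The function name check_chain_separator and the known_protocols argument are hypothetical; the set of protocol names is assumed to be supplied by the caller, for example from the keys of fsspec's known_implementations registry.

```python
def check_chain_separator(path, known_protocols):
    """Raise ValueError if '::' in path does not separate known protocols."""
    segments = path.split("::")
    # The last segment may be a bare path, so only the leading ones are checked.
    for segment in segments[:-1]:
        # A segment is either 'proto' or 'proto://something'.
        protocol = segment.split("://", 1)[0]
        if protocol not in known_protocols:
            raise ValueError(
                f"'::' in {path!r} is not used as a protocol separator "
                f"({protocol!r} is not a known protocol)"
            )
```

This rejects a path such as dir/file::name.root up front instead of letting it be split silently, at the cost of making such file names unusable.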
Hi, is there any update on this?
Maybe we could come up with an escape for the separator string, like "/::/" or something, which can make everything work at some small cost to the user.
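A rough sketch of how such an escape might work, assuming "/::/" is what the user writes to mean a literal :: in a file name. Nothing here exists in fsspec; split_chain and the placeholder approach are purely illustrative.

```python
ESCAPED = "/::/"      # hypothetical escape token, not an fsspec feature
PLACEHOLDER = "\x00"  # assumed never to appear in a real path


def split_chain(path):
    """Split on '::', keeping escaped occurrences as literal '::'."""
    # Hide escaped separators, split on the remaining ones, then restore.
    protected = path.replace(ESCAPED, PLACEHOLDER)
    return [bit.replace(PLACEHOLDER, "::") for bit in protected.split("::")]


print(split_chain("zip::file://a/::/b.zip"))  # ['zip', 'file://a::b.zip']
print(split_chain("a/::/b.root"))             # ['a::b.root']
```

The cost to the user is having to write the escaped form whenever a file name really contains ::, and a token like "/::/" could still collide with an unusual but legal path.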
Checking in on this; we've hit it again in ServiceX.