Custom subclasses now require a `fsspec` filesystem
Previously it used to be possible to create arbitrary UPath subclasses like this:
from upath import UPath
from upath.registry import register_implementation
class MyPath(UPath):
def exists(self):
return True
register_implementation("my", MyPath)
print(UPath("my://path").exists())
However, UPath now checks for and requires a corresponding fsspec filesystem in the UPath._parse_storage_options method, raising a ValueError if one is not found:
Traceback (most recent call last):
File "/home/josiah/code/universal_pathlib/./test.py", line 12, in <module>
print(UPath("my://path").exists())
^^^^^^^^^^^^^^^^^^
File "/home/josiah/code/universal_pathlib/upath/core.py", line 203, in __init__
self._storage_options = type(self)._parse_storage_options(
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/josiah/code/universal_pathlib/upath/core.py", line 290, in _parse_storage_options
fs_cls: type[AbstractFileSystem] = get_filesystem_class(protocol)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/josiah/code/universal_pathlib/venv/lib/python3.11/site-packages/fsspec/registry.py", line 233, in get_filesystem_class
raise ValueError(f"Protocol not known: {protocol}")
ValueError: Protocol not known: my
Would you consider handling this case and returning an empty storage options dictionary if no fsspec filesystem is found?
...
@classmethod
def _parse_storage_options(
cls, urlpath: str, protocol: str, storage_options: Mapping[str, Any]
) -> dict[str, Any]:
"""Parse storage_options from the urlpath"""
try:
fs_cls: type[AbstractFileSystem] = get_filesystem_class(protocol)
except ValueError:
return {}
else:
pth_storage_options = fs_cls._get_kwargs_from_urls(urlpath)
return {**pth_storage_options, **storage_options}
...
Thanks, Josiah
Yes, this is intentional, because we can't really predict what a filesystem class could extract from the uri as storage_options.
There are multiple ways to achieve what you want, including the one you suggest.
Could you describe your intention behind creating a UPath subclass for the "my" protocol, without defining a new AbstractFileSystem subclass for the "my" protocol?
The protocol I'm creating a new subclass for is actually untitled: (I was just using my:path as an example). For my use case, UntitledPath objects don't have to do anything other than not exist.
When communicating with a LSP server, you use the file URI to signal which file you are referring to. Typically you use file:// URIs for this, but for the case of new unsaved files, you use untitled: as the URI scheme. As part of implementing LSP server support in euporie, I added a new UntitledPath class to use with untitled files to facilitate this.
It doesn't really matter, since as you say above it's very easy to work around. I just wasn't sure if this was a deliberate change or not.
Feel free to close this
Interesting! Thanks for sharing the use case.
cross-referencing: https://github.com/neovim/neovim/issues/21276 because there are quite a few useful links to more context.
So if the "untitled" protocol is supposed to be a placeholder for in-memory files not written to disk, wouldn't the "memory" filesystem be an ideal candidate for it?
from fsspec.registry import register_implementation as fsspec_register_implementation
from fsspec.implementations.memory import MemoryFileSystem
class UntitledFileSystem(MemoryFileSystem):
store = {}
pseudo_dirs = [""]
protocol = "untitled"
root_marker = "/"
@classmethod
def _strip_protocol(cls, path):
if path.startswith("untitled://"):
path = path[len("untitled://") :]
return super()._strip_protocol(path)
fsspec_register_implementation("untitled", UntitledFileSystem)
from upath import UPath
from upath.registry import register_implementation as upath_register_implementation
from upath.implementations.memory import MemoryPath
class UntitledPath(MemoryPath):
pass
upath_register_implementation("untitled", UntitledPath)
u = UPath("untitled:///abc/file.txt")
print(repr(u))