
FSSpecParser implementation for URI handling

Open brl0 opened this issue 4 years ago • 5 comments

Update: As of Python 3.13, the "flavour" is called the "parser"; see ParserBase: https://github.com/barneygale/pathlib-abc/blob/e429fd2ae079d3c623e687715a8133f4c4769ed2/pathlib_abc/__init__.py#L38

Issue

Starting new issue as requested based on comment in issue #26.

After digging deeper into the code and exploring the pathlib implementation, I am wondering whether UPath would benefit from having something like a _URI_Flavour derived from the pathlib._Flavour class. This might make it easier to have consistent path handling across the fsspec-based implementations.

The base class: pathlib._Flavour

The posix implementation: pathlib._PosixFlavour

Here is a list of the members of the class for an idea of what may need to be implemented:

pathlib._PosixFlavour
  • altsep
  • casefold
  • casefold_parts
  • compile_pattern
  • gethomedir
  • has_drv
  • is_reserved
  • is_supported
  • join_parsed_parts
  • make_uri
  • parse_parts
  • pathmod
  • resolve
  • sep
  • splitroot
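To make the idea concrete, here is a minimal sketch of what a URI flavour could look like for two of the members listed above (splitroot and parse_parts). The class name URIFlavour is hypothetical, it does not subclass pathlib._Flavour, and treating "scheme://netloc" as the drive is an assumption rather than anything UPath does today. Query strings and fragments are ignored by this sketch.

```python
from urllib.parse import urlsplit


class URIFlavour:
    """Hypothetical URI flavour; member names mirror pathlib._PosixFlavour."""

    sep = "/"
    altsep = None
    has_drv = True  # assumption: "scheme://netloc" plays the role of the drive

    def splitroot(self, part, sep=sep):
        """Split a URI or plain path into (drive, root, relative part)."""
        parsed = urlsplit(part)
        if not parsed.scheme:
            # No scheme: behave like a POSIX path without a drive.
            if part.startswith(sep):
                return "", sep, part.lstrip(sep)
            return "", "", part
        drive = f"{parsed.scheme}://{parsed.netloc}"
        root = sep if parsed.path.startswith(sep) else ""
        return drive, root, parsed.path.lstrip(sep)

    def parse_parts(self, uri):
        """Return (drive, root, list of path components)."""
        drive, root, rel = self.splitroot(uri)
        # Drop empty components and "." so the parts are normalized.
        parts = [p for p in rel.split(self.sep) if p and p != "."]
        return drive, root, parts


flavour = URIFlavour()
print(flavour.splitroot("s3://bucket/key/file.txt"))
print(flavour.parse_parts("s3://bucket/key/./file.txt"))
```

A real implementation would probably delegate the protocol handling to fsspec instead of urllib.parse, since object stores allow characters such as "?" and "#" in keys.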

Of course, it probably makes sense to base as much as possible on fsspec functionality, especially considering its support for URL chaining.
Here are some possibly related functions from fsspec.core:

  • _un_chain
  • url_to_fs
  • split_protocol
  • strip_protocol
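For readers unfamiliar with chained URLs, the following stdlib-only sketch illustrates the kind of splitting involved. It is a simplified stand-in, not fsspec's code: the function names split_protocol_sketch and un_chain_sketch are hypothetical, and the sketch only assumes that chained URLs separate their links with "::" and that a protocol ends at "://".

```python
def split_protocol_sketch(urlpath):
    """Return (protocol, path); protocol is None when no "://" is present."""
    if "://" in urlpath:
        protocol, path = urlpath.split("://", 1)
        return protocol, path
    return None, urlpath


def un_chain_sketch(url):
    """Split a chained URL on "::" into a list of (protocol, path) pairs."""
    return [split_protocol_sketch(link) for link in url.split("::")]


# A zip archive stored on S3, addressed through a chained URL.
print(un_chain_sketch("zip://inner.csv::s3://bucket/archive.zip"))
```

Unlike this sketch, the real fsspec functions also deal with details such as Windows paths, bare protocol names in a chain, and per-protocol keyword arguments, which is a good reason to build on them rather than reimplement them.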

brl0 avatar Aug 04 '21 20:08 brl0

A few notes regarding flavours:

With #114 coming up, future versions of universal_pathlib will probably derive from the pathlib.PathBase base class (probably available in python>=3.13). Support for older pythons (<3.13) will likely be provided through a backport of PathBase in ~pathlib2 (see: https://github.com/jazzband/pathlib2/issues/70#issuecomment-1652303469)~ pathlib-abc (http://github.com/barneygale/pathlib-abc)

As far as I understand it, custom "flavour" implementations would then be equivalent to overriding PathBase.pathmod (see: https://github.com/python/cpython/pull/106533) which is basically using a subset of the public interface of posixpath and ntpath.

This is relevant for providing support for urllib.parse.urljoin-like joining behavior.
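The difference between the two joining behaviors can be seen with the standard library alone. This is only an illustration of the semantics being discussed; the example URL is made up.

```python
import posixpath
from urllib.parse import urljoin

base = "http://example.com/data/file.txt"

# posixpath.join appends to the full path, treating the last segment as a directory.
posix_joined = posixpath.join(base, "other.txt")

# urljoin resolves the reference relative to the base, replacing the last segment.
url_joined = urljoin(base, "other.txt")

# A leading slash resets to the server root with urljoin,
# while posixpath.join discards the base entirely.
url_rooted = urljoin(base, "/root.txt")
posix_rooted = posixpath.join(base, "/root.txt")

print(posix_joined)  # http://example.com/data/file.txt/other.txt
print(url_joined)    # http://example.com/data/other.txt
print(url_rooted)    # http://example.com/root.txt
print(posix_rooted)  # /root.txt
```

Note that urljoin only resolves relative references for schemes it knows about (such as http), so a path module for other protocols would likely need its own joining logic.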

ap-- avatar Aug 28 '23 19:08 ap--

See this message about the timeline and back/future-port of the ABC: https://github.com/drivendataorg/cloudpathlib/issues/347#issuecomment-1877496952

merwok avatar Jan 04 '24 17:01 merwok

It looks like we should implement our own custom pathlib_abc.PathModuleBase for all Python versions once we start to rely on pathlib_abc.

See: https://github.com/barneygale/pathlib_abc/issues/19
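As a rough idea of what such a custom path module might involve, here is a duck-typed sketch that exposes a few posixpath-style functions while keeping the protocol and netloc intact. The class name URIPathModule is hypothetical, it deliberately does not import or subclass anything from pathlib_abc, and the exact set of methods the real base class requires is an assumption.

```python
import posixpath


class URIPathModule:
    """Hypothetical posixpath-like module for URIs; not pathlib_abc's API."""

    sep = "/"

    def splitdrive(self, path):
        """Treat "protocol://netloc" as the drive; return (drive, tail)."""
        head, found, rest = path.partition("://")
        if not found:
            return "", path
        netloc, slash, tail = rest.partition("/")
        return f"{head}://{netloc}", slash + tail

    def join(self, path, *paths):
        """Join segments; a segment with its own protocol replaces the base."""
        drive, tail = self.splitdrive(path)
        for p in paths:
            p_drive, p_tail = self.splitdrive(p)
            if p_drive:
                drive, tail = p_drive, p_tail
            else:
                tail = posixpath.join(tail, p_tail)
        # Keep a separator between the netloc and the first path segment.
        if drive and tail and not tail.startswith(self.sep):
            tail = self.sep + tail
        return drive + tail

    def split(self, path):
        """Return (head, name), keeping the drive attached to the head."""
        drive, tail = self.splitdrive(path)
        head, name = posixpath.split(tail)
        return drive + head, name

    def isabs(self, path):
        """A path is absolute if it has a drive or starts with the separator."""
        drive, tail = self.splitdrive(path)
        return bool(drive) or tail.startswith(self.sep)


mod = URIPathModule()
print(mod.join("s3://bucket/a", "b", "c.txt"))
print(mod.split("s3://bucket/a/b.txt"))
```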

ap-- avatar Jan 17 '24 10:01 ap--

Note to self:

This should be relatively simple to achieve in the Python-3.12 support PR I have already drafted:

https://github.com/ap--/universal_pathlib/blob/2878bff22e32dbc5240766f51b3e7d31c00b4b9a/upath/core312plus.py#L38-L58

ap-- avatar Jan 17 '24 10:01 ap--

see: https://github.com/python/cpython/issues/114575

ap-- avatar Jan 28 '24 22:01 ap--