`AzureBlobFileSystem._strip_protocol` strips leading slashes when container is missing

Open ap-- opened this issue 1 year ago • 0 comments

Hello,

I have a question regarding the behavior of `_strip_protocol`. Comparing s3fs, gcsfs and adlfs, I observed that adlfs strips leading slashes from urlpaths that are missing a container (netloc), whereas s3fs and gcsfs keep them. It seems that this was introduced in #332.

```python
>>> import fsspec

# identical behavior for urlpaths with netloc
>>> fsspec.get_filesystem_class("s3")._strip_protocol("s3://bucket/a/b/c")
'bucket/a/b/c'
>>> fsspec.get_filesystem_class("gs")._strip_protocol("gs://bucket/a/b/c")
'bucket/a/b/c'
>>> fsspec.get_filesystem_class("az")._strip_protocol("az://bucket/a/b/c")
'bucket/a/b/c'

# different behavior for incorrect urlpaths
>>> fsspec.get_filesystem_class("s3")._strip_protocol("s3:///a/b/c")
'/a/b/c'
>>> fsspec.get_filesystem_class("gs")._strip_protocol("gs:///a/b/c")
'/a/b/c'
>>> fsspec.get_filesystem_class("az")._strip_protocol("az:///a/b/c")
'a/b/c'  # <- adlfs strips the leading slash
```

Currently `az:///a/b/c` and `az://a/b/c` are considered equivalent. Is this intentional? If not, I can work on a PR.
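
To make the difference concrete without needing any of the three libraries installed, here is a minimal sketch using only `urllib.parse`. The two helpers, `strip_keep_leading_slash` and `strip_drop_leading_slash`, are hypothetical names I made up for illustration. They are not the actual s3fs, gcsfs or adlfs implementations; they only reproduce the outputs shown above for these inputs.

```python
from urllib.parse import urlsplit


def strip_keep_leading_slash(urlpath: str) -> str:
    """Hypothetical helper: reproduces the s3fs/gcsfs results shown above."""
    parts = urlsplit(urlpath)
    # With an empty netloc the path keeps its leading slash.
    return (parts.netloc + parts.path).rstrip("/")


def strip_drop_leading_slash(urlpath: str) -> str:
    """Hypothetical helper: reproduces the adlfs result shown above."""
    parts = urlsplit(urlpath)
    # Stripping on both sides removes the leading slash as well.
    return (parts.netloc + parts.path).strip("/")


# With a netloc, both strategies agree.
print(strip_keep_leading_slash("az://bucket/a/b/c"))
print(strip_drop_leading_slash("az://bucket/a/b/c"))

# Without a netloc, only the second strategy collapses the two urlpaths.
print(strip_keep_leading_slash("az:///a/b/c"))
print(strip_drop_leading_slash("az:///a/b/c"))
print(strip_drop_leading_slash("az://a/b/c"))
```

With the second strategy, `a` can no longer be distinguished as "first path segment with no container" versus "container name", which is the ambiguity I am asking about.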

Cheers, Andreas

ap--, Aug 30 '24 18:08