
Provide `key`/`object_name`/`blob` attribute in CloudPath

Open Dev-iL opened this issue 1 year ago • 3 comments

Suppose I have a remote path as follows:

gcs_path = GCSPath("gs://bucket-name/A/B/C/filename.tar.gz")

I'd like a way to get a UPath object that represents the location in the bucket, e.g.

UPath("A/B/C/filename.tar.gz")

I'd like to avoid my present workaround of `"/".join(gcs_path.parts[1:])`, since it's not immediately clear what this code does.

Possibly related to https://github.com/fsspec/universal_pathlib/issues/170

Dev-iL avatar Oct 21 '24 08:10 Dev-iL

Can you provide more context?

The following style for creating paths is implemented for s3, gcs and az object storage:

>>> import upath
>>> upath.UPath("A/B/C/filename.tar.gz", protocol="gs", bucket="bucket-name")
GCSPath('gs://bucket-name/A/B/C/filename.tar.gz')

ap-- avatar Oct 21 '24 11:10 ap--

I need the other way around: I have a GCSPath object, and I'd like to extract the path itself (i.e. without the protocol and bucket) using one of the object's methods or attributes. Is there a way to do that?

Dev-iL avatar Oct 21 '24 12:10 Dev-iL

So from my understanding, to stay in the google storage vocabulary, you'd want the OBJECT_NAME, whereas right now you can only retrieve the PATH_TO_RESOURCE = BUCKET_NAME/OBJECT_NAME.

This will become generally available once either the relative_to behaviour is fixed (which requires some more thought before rolling out) or URL chaining is implemented (#28); with URL chaining you could then use dirfs to remove the prefix from the path.
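
Until then, the BUCKET_NAME / OBJECT_NAME split can be sketched with the standard library alone. This is only an illustration working on the URL string, not part of universal_pathlib; the helper name `split_gs_url` is hypothetical.

```python
from urllib.parse import urlsplit


def split_gs_url(url: str) -> tuple[str, str]:
    """Split a gs:// URL into (bucket_name, object_name).

    Hypothetical helper: it only parses the URL string and does not
    touch universal_pathlib or the storage backend.
    """
    parts = urlsplit(url)
    if parts.scheme not in {"gs", "gcs"}:
        raise ValueError(f"not a Google Cloud Storage URL: {url!r}")
    # netloc holds the bucket; the URL path holds the object name
    # with a leading slash that we drop.
    return parts.netloc, parts.path.lstrip("/")


bucket, object_name = split_gs_url("gs://bucket-name/A/B/C/filename.tar.gz")
print(bucket)       # bucket-name
print(object_name)  # A/B/C/filename.tar.gz
```

With a path object you would pass `str(gcs_path)`, assuming its string form is the full `gs://` URL as in the repr shown earlier in the thread.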

ap-- avatar Oct 22 '24 08:10 ap--

In a future version, .key could be made available with the implementation below:

>>> import upath
>>> x = upath.UPath("gs://bucket/abc/efg/file.txt")
>>> x.path.removeprefix(x.anchor)
'abc/efg/file.txt'
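
If you want to give that expression a name today, a small wrapper is one option. The sketch below is hypothetical (`object_key` is not a upath function) and is duck-typed on `.path` and `.anchor`; the stand-in object uses assumed attribute values chosen to match the output above, not values read from upath itself.

```python
from types import SimpleNamespace


def object_key(p) -> str:
    """Return p.path with the leading p.anchor stripped.

    Hypothetical helper: works on any object exposing string-like
    `.path` and `.anchor` attributes.
    """
    path, anchor = str(p.path), str(p.anchor)
    if not path.startswith(anchor):
        raise ValueError(f"{path!r} does not start with anchor {anchor!r}")
    return path.removeprefix(anchor)  # str.removeprefix needs Python 3.9+


# Stand-in for a UPath; these attribute values are assumptions.
fake = SimpleNamespace(path="bucket/abc/efg/file.txt", anchor="bucket/")
print(object_key(fake))  # abc/efg/file.txt
```

The explicit `startswith` check is a design choice: `removeprefix` alone would silently return the unchanged path if the anchor did not match.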

More generally, relative path behavior is available in upath>=0.3.0:

>>> import upath
>>> x = upath.UPath("gs://bucket/abc/efg/file.txt")
>>> y = upath.UPath("gs://bucket/")
>>> z = x.relative_to(y)
>>> z
<relative GCSPath 'abc/efg/file.txt'>
>>> str(z)
'abc/efg/file.txt'
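
For comparison, this mirrors how `relative_to` behaves in the standard library's `pathlib`. The sketch below uses `PurePosixPath` as a local analogy only; whether upath raises the same error for unrelated paths is not something I'm claiming here.

```python
from pathlib import PurePosixPath

# Local analogy: treat the bucket as the first path component.
full = PurePosixPath("/bucket/abc/efg/file.txt")
root = PurePosixPath("/bucket")

rel = full.relative_to(root)
print(rel)  # abc/efg/file.txt

# pathlib refuses to relativize against an unrelated prefix.
try:
    full.relative_to(PurePosixPath("/other-bucket"))
except ValueError as exc:
    print(f"not under that prefix: {exc}")
```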

ap-- avatar Oct 05 '25 16:10 ap--