pathlib-abc icon indicating copy to clipboard operation
pathlib-abc copied to clipboard

Base class for `PathBase.stat()` return value?

Open barneygale opened this issue 2 years ago • 4 comments

Is it unreasonable or incorrect to expect users to instantiate an os.stat_result object? It's a little tricky.

barneygale avatar Dec 16 '23 20:12 barneygale

Proposal: add a status() method that returns a Status object, which would resemble an os.DirEntry and include methods like is_dir() and file_type().

PathBase.status() would be abstract, and PathBase.stat() would call self.status().as_stat_result().

Path.status() would call Status.from_stat_result(self.stat()), and Path.stat() would call os.stat() as it does now.

Users wouldn't need to deal with the low-level stat_result interface.

barneygale avatar Jan 25 '24 22:01 barneygale

This sounds very interesting for universal_pathlib. Cross-referencing fsspec/universal_pathlib#145

In filesystem_spec this "non os.stat_result returning" method is provided by AbstractFileSystem.info() and its return value is a dictionary with a few standardised keys and unstructured optional information that a filesystem implementation can return.

Apache Airflow provides a wrapper class based on dict that takes this info() dict and provides some of os.stat_result's interface: https://github.com/apache/airflow/blob/8b5a1ff5e9b3ef6941ab932de64dd1c939fcc2fb/airflow/io/utils/stat.py#L22-L83

In internal code I have mostly seen .stat() being used to infer the size or the creation / modification time of files.

ap-- avatar Jan 30 '24 14:01 ap--

Very interesting, thank you!

Out of interest, why not a dataclass rather than a dict with standardised keys? I was leaning towards a dataclass personally. Subclasses (e.g. S3Status) could add their own fields.

And do you think there's any sense in adding a PathBase.set_status/info() method? Other methods like PathBase.chmod() could be made to call it, supplying a Status(permissions=...) object. Users could call it with an S3Status object to set S3-specific metadata.

barneygale avatar Jan 30 '24 14:01 barneygale

Out of interest, why not a dataclass rather than a dict with standardised keys? I was leaning towards a dataclass personally. Subclasses (e.g. S3Status) could add their own fields.

I'm not sure, but my guess would be that this was a pragmatic decision. fsspec.spec.AbstractFileSystem.info() returns a dict since fsspec's first release. There is an open issue in the repo to standardise this across filesystems.

And do you think there's any sense in adding a PathBase.set_status/info() method?

Hmmm, interesting. I'll try to collect some examples for specific filesystems, and will report back.

ap-- avatar Feb 05 '24 13:02 ap--

Resolving this issue, as we have PathInfo in 0.4.0+

barneygale avatar Mar 22 '25 20:03 barneygale