
new feature: suggestions for `Lister`

Open chitralverma opened this issue 6 months ago • 1 comments

Feature Description

Thought of a few Lister and Entry enhancements for your consideration. (Affects public API)

Problem and Solution

  • stat returns Metadata, which is fine, and Lister returns an Entry, which is path + Metadata. But since Lister doesn't guarantee metadata retrieval because of the performance implications, Entry should be path + Option<Metadata> instead.
  • Some users might still be willing to pay that performance cost, so ListOptions should have a key with_metadata: bool that ensures metadata retrieval. This way the user decides between performance and complete results. (They can already do this with an external stat call, but this option would simplify user-side code.)
  • Additionally, stat already has some options which might be useful in Lister as well (can be pushed down, if backend services support it),
    • if_match
    • if_none_match
    • if_modified_since
    • if_unmodified_since
  • Suggestions for better naming,
    • deleted -> with_deleted
    • versions -> with_versions

What do you think?

Additional Context

No response

Are you willing to contribute to the development of this feature?

  • [x] Yes, I am willing to contribute to the development of this feature.

chitralverma, Jun 09 '25 07:06

Thank you @chitralverma for raising this.

Metadata is always available, as we will provide at least the file type. Changing it to Option<Metadata> doesn't make sense to me.

The more complex issue here is that different services may return different sets of metadata. For example, S3 might have an etag but not a content_type. So even if we return it as Option<Metadata>, users will still have to handle each metadata field on a case-by-case basis.
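To illustrate the per-field handling described above, here is a minimal sketch. It assumes stand-in `Mode` and `Metadata` types written only for this example; they are not OpenDAL's actual types, and `summarize` is a hypothetical helper. The point is only that the entry kind is always present while every other field has to be checked individually.

```rust
/// Hypothetical stand-in for an entry's kind; not OpenDAL's actual type.
#[derive(Debug, Clone, Copy, PartialEq)]
enum Mode {
    File,
    Dir,
}

/// Hypothetical stand-in for metadata: the mode is always set, while
/// every other field depends on what the service happened to return.
#[derive(Debug, Clone)]
struct Metadata {
    mode: Mode,
    etag: Option<String>,
    content_type: Option<String>,
}

/// Build a short summary, handling each optional field on its own.
fn summarize(meta: &Metadata) -> String {
    let etag = meta.etag.as_deref().unwrap_or("<no etag>");
    let content_type = meta
        .content_type
        .as_deref()
        .unwrap_or("<no content type>");
    format!("{:?} etag={} content_type={}", meta.mode, etag, content_type)
}

fn main() {
    // A listing that returned an etag but no content type.
    let file = Metadata {
        mode: Mode::File,
        etag: Some("abc".to_string()),
        content_type: None,
    };
    // A directory entry with nothing beyond its mode.
    let dir = Metadata {
        mode: Mode::Dir,
        etag: None,
        content_type: None,
    };
    println!("{}", summarize(&file));
    println!("{}", summarize(&dir));
}
```

Wrapping the whole struct in an `Option` would not remove any of these per-field checks, which is the concern raised above.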

> Some users might still be willing to pay that performance cost, so ListOptions should have a key with_metadata: bool that ensures metadata retrieval. This way the user decides between performance and complete results. (They can already do this with an external stat call, but this option would simplify user-side code.)

We had this before. We designed a complex system specifically for this purpose, but in reality, it doesn't simplify things. Having list overlap with stat is not a good idea. We discussed this in opendal::docs::rfcs::rfc_5314_remove_metakey.

> (can be pushed down, if backend services support it),

No services can push this down, except perhaps for certain in-memory structures. It might be easier for us to use a filter_map externally.
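As a rough sketch of the external filter_map idea, assuming a stand-in `Entry` type with only a path and an optional etag (not OpenDAL's actual `Entry`), an if_match-style condition could be applied on the caller's side like this. The helper name `paths_matching_etag` is hypothetical.

```rust
/// Hypothetical stand-in for a listed entry; not OpenDAL's actual type.
#[derive(Debug, Clone)]
struct Entry {
    path: String,
    etag: Option<String>,
}

/// Keep only the paths whose etag equals `wanted`.
/// Entries that carry no etag are dropped, since the condition
/// cannot be checked for them.
fn paths_matching_etag<I>(entries: I, wanted: &str) -> Vec<String>
where
    I: IntoIterator<Item = Entry>,
{
    entries
        .into_iter()
        .filter_map(|e| {
            if e.etag.as_deref() == Some(wanted) {
                Some(e.path)
            } else {
                None
            }
        })
        .collect()
}

fn main() {
    let entries = vec![
        Entry { path: "a.txt".to_string(), etag: Some("v1".to_string()) },
        Entry { path: "b.txt".to_string(), etag: None },
        Entry { path: "c.txt".to_string(), etag: Some("v2".to_string()) },
    ];
    let matched = paths_matching_etag(entries, "v1");
    println!("{:?}", matched);
}
```

Dropping entries without an etag is a choice made for this sketch; a caller could instead fall back to a stat call for those entries.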

> Suggestions for better naming, deleted -> with_deleted, versions -> with_versions

It seems like a good idea to me to change them to includes_deleted and includes_versions. How about creating a separate issue for this?

I originally picked deleted and versions because our API usage looks like:

let res = op.list_with(path).deleted(true).versions(true).await?;

Will this flavor cause confusion about our behavior?
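To compare the two naming flavors side by side, here is a small sketch. It assumes a stand-in builder `ListArgs` written only for this example; it is not OpenDAL's actual `list_with` builder, and the `includes_*` methods are the proposed names, not an existing API.

```rust
/// Hypothetical stand-in for list arguments; not OpenDAL's actual builder.
#[derive(Debug, Default, Clone, PartialEq)]
struct ListArgs {
    deleted: bool,
    versions: bool,
}

impl ListArgs {
    /// Current short flavor, as in `.deleted(true)`.
    fn deleted(mut self, v: bool) -> Self {
        self.deleted = v;
        self
    }

    /// Current short flavor, as in `.versions(true)`.
    fn versions(mut self, v: bool) -> Self {
        self.versions = v;
        self
    }

    /// Proposed longer flavor; sets the same flag as `deleted`.
    fn includes_deleted(self, v: bool) -> Self {
        self.deleted(v)
    }

    /// Proposed longer flavor; sets the same flag as `versions`.
    fn includes_versions(self, v: bool) -> Self {
        self.versions(v)
    }
}

fn main() {
    let short = ListArgs::default().deleted(true).versions(true);
    let long = ListArgs::default()
        .includes_deleted(true)
        .includes_versions(true);
    // Both flavors configure the same options; only the call site reads differently.
    assert_eq!(short, long);
    println!("{:?}", short);
}
```

Read at the call site, `.deleted(true)` could be taken to mean "only deleted entries", while `.includes_deleted(true)` states that deleted entries are added to the normal results.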

Xuanwo, Jun 09 '25 10:06