suggestion: impl debug for reader, writer and more metadata
Feature Description
@Xuanwo I was working on the side to prepare a PR which allows opendal to act as an external byte source for Polars.
While doing this I came across a few limitations which need `Debug` to be implemented for `Reader` and `Writer`. For now I have implemented these on my end directly, but I was wondering whether they could be implemented directly in core. Since `ReadContext` is private, maybe some non-critical metadata can be exposed?
Problem and Solution
With this I also have the following suggestions:
- When a reader is created, maybe a stat call can be triggered eagerly which exposes the `Metadata` object as a field in the reader. The reader can then fail eagerly if the object doesn't exist.
  - As I remember, this eager call is triggered anyway when the user passes the byte range as `..`, so we might as well do it beforehand.
- `Metadata` has a lot of info already, but some things like scheme, prefix, and fully qualified path are still missing, even though they're available in the `ReaderContext`.
What do you think?
Additional Context
Ref: https://github.com/apache/opendal/discussions/5972
Are you willing to contribute to the development of this feature?
- [ ] Yes, I am willing to contribute to the development of this feature.
Hi, @chitralverma, thanks a lot for working on this.
> While doing this I came across a few limitations which need `Debug` to be implemented for `Reader` and `Writer`.
Is this requirement from Polars? I'm open to implementing `Debug` for `Reader` and `Writer`, and maybe exposing some meaningful internal state.
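For illustration, here is a minimal sketch of how a manual `Debug` impl can print selected fields of a private context without making the context itself public. The `Reader` and `ReadContext` types below are hypothetical stand-ins with made-up fields (`scheme`, `path`, `chunk`), not opendal's real definitions.

```rust
use std::fmt;

// Hypothetical stand-in for a private context type; the fields are
// assumptions for this sketch, not opendal's actual `ReadContext`.
struct ReadContext {
    scheme: String,
    path: String,
    chunk: Option<usize>,
}

// Hypothetical stand-in for a reader that owns the private context.
pub struct Reader {
    ctx: ReadContext,
}

impl Reader {
    pub fn new(scheme: &str, path: &str, chunk: Option<usize>) -> Self {
        Reader {
            ctx: ReadContext {
                scheme: scheme.to_string(),
                path: path.to_string(),
                chunk,
            },
        }
    }
}

// Manual impl: only non-critical fields are printed, and
// `finish_non_exhaustive` marks that other internals are omitted.
impl fmt::Debug for Reader {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        f.debug_struct("Reader")
            .field("scheme", &self.ctx.scheme)
            .field("path", &self.ctx.path)
            .field("chunk", &self.ctx.chunk)
            .finish_non_exhaustive()
    }
}
```

With this shape, `format!("{:?}", Reader::new("s3", "data/a.parquet", None))` yields `Reader { scheme: "s3", path: "data/a.parquet", chunk: None, .. }`, which is enough for a caller that only needs `Debug` as a trait bound.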
> When a reader is created, maybe a stat call can be triggered eagerly which exposes the `Metadata` object as a field in the reader. The reader can then fail eagerly if the object doesn't exist.
Maybe we can have a new option for this, but I'm not sure if it's a good idea.
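To make the idea concrete, here is a rough sketch of what an opt-in eager stat could look like. Everything here is hypothetical: the `eager_stat` flag, the in-memory `Storage`, and the simplified `Metadata` and `Error` types are stand-ins for this example only and do not reflect opendal's API.

```rust
use std::collections::HashMap;

// Hypothetical, simplified metadata; the real `Metadata` carries far more.
#[derive(Debug, Clone, PartialEq)]
pub struct Metadata {
    pub content_length: u64,
}

#[derive(Debug, PartialEq)]
pub enum Error {
    NotFound(String),
}

// In-memory stand-in for a storage backend.
pub struct Storage {
    objects: HashMap<String, Vec<u8>>,
}

impl Storage {
    pub fn new() -> Self {
        Storage { objects: HashMap::new() }
    }

    pub fn put(&mut self, path: &str, data: &[u8]) {
        self.objects.insert(path.to_string(), data.to_vec());
    }

    pub fn stat(&self, path: &str) -> Result<Metadata, Error> {
        self.objects
            .get(path)
            .map(|data| Metadata { content_length: data.len() as u64 })
            .ok_or_else(|| Error::NotFound(path.to_string()))
    }
}

pub struct Reader {
    path: String,
    // Populated only when the caller opted into the eager stat.
    metadata: Option<Metadata>,
}

impl Reader {
    // `eager_stat` is the hypothetical option: when true, creation performs
    // a stat and fails immediately if the object does not exist.
    pub fn new(storage: &Storage, path: &str, eager_stat: bool) -> Result<Self, Error> {
        let metadata = if eager_stat {
            Some(storage.stat(path)?)
        } else {
            None
        };
        Ok(Reader { path: path.to_string(), metadata })
    }

    pub fn path(&self) -> &str {
        &self.path
    }

    pub fn metadata(&self) -> Option<&Metadata> {
        self.metadata.as_ref()
    }
}
```

The trade-off is visible in the sketch: with `eager_stat` set to `false`, creating a reader for a missing object still succeeds and costs no extra request, while `true` surfaces `NotFound` up front at the price of one stat call per reader.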
> As I remember, this eager call is triggered anyway when the user passes the byte range as `..`, so we might as well do it beforehand.
The stat call only occurs when users pass both `..` and chunk/range options at the same time. This means that if users simply read from the beginning to the end, no stat call is triggered.
> `Metadata` has a lot of info already, but some things like scheme, prefix, and fully qualified path are still missing, even though they're available in the `ReaderContext`.
They can be retrieved from `op.info()`, so duplicating them in `Metadata` seems a bit redundant to me.