Support TAR file?
I have authored vfs-tar: https://github.com/Berrysoft/vfs-tar
If you would like it, I'd like to author a readonly tar accessor.
Seems interesting! I have considered adding tar support before.
I have a few questions about the implementations:
- Can we accept an `Object` (which could be on s3/gcs instead of local fs) as input? so that we can operate tar files on any storages (which is opendal's VISION).
  - we can use the `ObjectReader`'s seek support to read file header only instead of the whole content.
- Can you split `tar-parser` related logic into separate repo?
  - So `vfs-tar` can share the same parser with opendal?
I'm planning on splitting out tar-parser. I have also noticed that nom supports Read-like streams, so there should be no big problems in doing so.
However, I'd like to support large tar files without large memory usage. That's why I use mmap in vfs-tar now. If Object is used, is this goal still possible?
> If `Object` is used, is this goal still possible?
Yes. We only need to scan and store the tar headers, which serve as an index. Content will be fetched when users call read.
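To make the header-index idea concrete, here is a minimal sketch in Rust using only the standard library. It assumes a plain ustar layout (512-byte headers, name at bytes 0..100, octal size at bytes 124..136) and uses any `Read + Seek` value as a stand-in for a seekable object reader. The names `Entry`, `build_index`, `read_entry`, and `tar_member` are hypothetical and are not part of OpenDAL, vfs-tar, or tar-parser2. It ignores checksums, the name prefix field, and GNU/pax extensions.

```rust
use std::collections::HashMap;
use std::io::{self, Read, Seek, SeekFrom};

const BLOCK: u64 = 512;

/// Location of one member's content inside the tar stream (hypothetical type).
#[derive(Debug, Clone, Copy, PartialEq)]
struct Entry {
    offset: u64,
    size: u64,
}

/// Parse a space/NUL-padded octal field such as the ustar size field.
fn parse_octal(field: &[u8]) -> u64 {
    field
        .iter()
        .skip_while(|b| **b == b' ')
        .take_while(|b| (b'0'..=b'7').contains(*b))
        .fold(0u64, |acc, b| acc * 8 + u64::from(*b - b'0'))
}

/// Scan only the headers, seeking past each member's content.
fn build_index<R: Read + Seek>(r: &mut R) -> io::Result<HashMap<String, Entry>> {
    let mut index = HashMap::new();
    let mut header = [0u8; BLOCK as usize];
    loop {
        match r.read_exact(&mut header) {
            Ok(()) => {}
            // Tolerate archives that end without the trailing zero blocks.
            Err(e) if e.kind() == io::ErrorKind::UnexpectedEof => break,
            Err(e) => return Err(e),
        }
        // An all-zero block marks the end of the archive.
        if header.iter().all(|b| *b == 0) {
            break;
        }
        let name_len = header[..100].iter().position(|b| *b == 0).unwrap_or(100);
        let name = String::from_utf8_lossy(&header[..name_len]).into_owned();
        let size = parse_octal(&header[124..136]);
        let offset = r.stream_position()?;
        index.insert(name, Entry { offset, size });
        // Content is padded to a multiple of 512 bytes; skip it without reading.
        let padded = (size + BLOCK - 1) / BLOCK * BLOCK;
        r.seek(SeekFrom::Current(padded as i64))?;
    }
    Ok(index)
}

/// Fetch one member's content on demand, using the stored offset and size.
fn read_entry<R: Read + Seek>(r: &mut R, entry: Entry) -> io::Result<Vec<u8>> {
    r.seek(SeekFrom::Start(entry.offset))?;
    let mut buf = vec![0u8; entry.size as usize];
    r.read_exact(&mut buf)?;
    Ok(buf)
}

/// Demo helper: build one header + padded content (checksum left unset).
fn tar_member(name: &str, data: &[u8]) -> Vec<u8> {
    let mut block = vec![0u8; 512];
    block[..name.len()].copy_from_slice(name.as_bytes());
    // 11 octal digits plus a NUL terminator fill the 12-byte size field.
    let size = format!("{:011o}\0", data.len());
    block[124..136].copy_from_slice(size.as_bytes());
    block.extend_from_slice(data);
    block.resize(block.len() + (512 - data.len() % 512) % 512, 0);
    block
}
```

With an in-memory archive wrapped in `std::io::Cursor`, `build_index` reads only the 512-byte headers and keeps one small `Entry` per member, so memory use grows with the number of members rather than the archive size. `read_entry` then seeks directly to a member's bytes when a caller asks for them.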
Well, I'm somewhat skeptical about that. Anyway, it's worth a try. I will split tar-parser out when I am free and will report the progress here.
> I will split tar-parser out when I am free and will report the progress here.
Thanks in advance!
tar-parser is published: https://crates.io/crates/tar-parser2
vfs-tar is also updated to use it: https://crates.io/crates/vfs-tar
It seems unsuitable for implementation in OpenDAL. Let's close this issue.