rust-cached-path
Determination of archive format
I see that cached-path currently determines how to extract an archive according to its filename extension:
https://github.com/epwalsh/rust-cached-path/blob/db8cafb061ec1ff561747026f5db4317bfbaff7d/src/archives.rs#L17-L23
The problem that I have is that some archives do not use the expected extension format (in my case, gzipped tarballs are using .tgz rather than .tar.gz). While this could be addressed by expanding/customising the extension list used by cached-path, perhaps it's also an opportunity to consider some alternative approaches:
- HTTP headers (namely Content-Type and Content-Encoding);
- detection "magic" as per (or via) the file(1) utility (there are also the magic and bindet crates, the former a wrapper around the libmagic C library and the latter not widely used, but both possibly useful here); or
- a user-provided format specifier?
Personally I feel that HTTP headers would be best (if available: obviously not the case for local resources), perhaps falling back to magic and/or file extensions if no other option is available.
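To make the proposed order concrete, here is a rough sketch of such a fallback chain: headers first, then magic bytes, then the file extension. Everything in it is hypothetical (the `ArchiveFormat` enum, the function names, and the choice of media types) and is not taken from cached-path's current code. It also assumes that a gzip payload is always a tarball, which real code would need to verify.

```rust
/// Archive formats this sketch distinguishes (hypothetical, not the crate's own type).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum ArchiveFormat {
    TarGz,
    Zip,
}

/// Guess the format from an HTTP `Content-Type` header value.
pub fn from_content_type(content_type: &str) -> Option<ArchiveFormat> {
    // Drop parameters such as "; charset=..." and normalise case.
    let mime = content_type.split(';').next()?.trim().to_ascii_lowercase();
    match mime.as_str() {
        "application/zip" => Some(ArchiveFormat::Zip),
        // Assumption: a gzip media type means a gzipped tarball.
        "application/gzip" | "application/x-gzip" => Some(ArchiveFormat::TarGz),
        _ => None,
    }
}

/// Guess the format from the first bytes of the resource.
pub fn from_magic(head: &[u8]) -> Option<ArchiveFormat> {
    if head.starts_with(&[0x1f, 0x8b]) {
        // gzip magic number; again assumed to wrap a tarball.
        Some(ArchiveFormat::TarGz)
    } else if head.starts_with(b"PK\x03\x04") {
        // zip local file header signature.
        Some(ArchiveFormat::Zip)
    } else {
        None
    }
}

/// Guess the format from the file name, accepting `.tgz` as well as `.tar.gz`.
pub fn from_extension(name: &str) -> Option<ArchiveFormat> {
    let name = name.to_ascii_lowercase();
    if name.ends_with(".tar.gz") || name.ends_with(".tgz") {
        Some(ArchiveFormat::TarGz)
    } else if name.ends_with(".zip") {
        Some(ArchiveFormat::Zip)
    } else {
        None
    }
}

/// Try headers first, then magic bytes, then the extension.
pub fn detect(content_type: Option<&str>, head: &[u8], name: &str) -> Option<ArchiveFormat> {
    content_type
        .and_then(from_content_type)
        .or_else(|| from_magic(head))
        .or_else(|| from_extension(name))
}
```

For a local resource there is no header, so `detect(None, head, name)` would skip straight to the magic-byte and extension checks.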
Happy to submit a PR with whatever approach you feel is most suitable for this library, even if that only means adding .tgz to the existing extension list?
Hey @eggyal, I would definitely accept a PR for this. I like the idea of using HTTP headers, so I think that should be the first priority. It would also be nice to allow the user to directly specify the format, so if that's straightforward enough to do in the same PR, please go ahead. I'm not opposed to detection "magic" as a fallback as well... that could always be an optional feature of this crate.
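One possible shape for the user-specified format, purely as a sketch: an options struct whose explicit format, when set, takes priority over whatever detection found. The names `ExtractOptions`, `with_format`, and `resolve` are made up for illustration and are not part of cached-path's API.

```rust
/// Archive formats this sketch distinguishes (hypothetical, not the crate's own type).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum ArchiveFormat {
    TarGz,
    Zip,
}

/// Hypothetical per-call options; `format` stays `None` unless the user sets it.
#[derive(Debug, Default)]
pub struct ExtractOptions {
    pub format: Option<ArchiveFormat>,
}

impl ExtractOptions {
    /// Builder-style setter for an explicit format.
    pub fn with_format(mut self, format: ArchiveFormat) -> Self {
        self.format = Some(format);
        self
    }

    /// An explicit format wins; otherwise fall back to the detected one.
    pub fn resolve(&self, detected: Option<ArchiveFormat>) -> Option<ArchiveFormat> {
        self.format.or(detected)
    }
}
```

With this shape, a caller who knows a `.tgz` download is a gzipped tarball could pass `ExtractOptions::default().with_format(ArchiveFormat::TarGz)` and bypass detection entirely.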