cargo-binstall
cargo-binstall copied to clipboard
Tracking of upstream bug in `tar`
https://github.com/alexcrichton/tar-rs/issues/295
tar
failed to handle GNUSparseFile
correctly.
It seems that it would take quite some time for the upstream to fix it.
Since this bug is originally found when downloading cbindgen
from cargo-quickinstall
, @passcod shall we ask devs of cargo-quickinstall
to stop using the gnu sparse file extension for it is not portable?
We can try, but is this something particular to qi or could we get that for someone else?
More out there: Are there other tar crates we could evaluate? Or even, how hard would it be to fix ourselves upstream?
We can try, but is this something particular to qi or could we get that for someone else?
cargo-binstall
would not be able to extract any tarball that uses the gnu sparse file extension.
More out there: Are there other tar crates we could evaluate?
Not sure about that as the feature requires --sparse
(-S
) flag to be passed to tar
to enable and most binaries are unlikely to have large enough holes to be a sparse file.
Besides, this problem is limited to the fetchers only.
fetch_crate_cratesio
would never have this problem, since most crates are published using cargo-publish
and cargo
also uses tar
.
Or even, how hard would it be to fix ourselves upstream?
tar
seems to have some degree of gnu sparse file extension support:
- It can recognize a gnu sparse file
- It can parse the gnu sparse file header
But to extract a sparse file, it still need post-processing logic in xsparse
and it also need to fix the filename, e.g. extracting the cbindgen
tarball from qi using tar
gives GNUSparseFile.0/cbindgen
, which the file
recognizes as a data file, not elf/executable.
Another workaround that we could add is fallback to cargo-install
when we detect the presence of GNUSparseFile.*
, but that one is really a hack and I'm not sure it will work reliably.
It can recognize a gnu sparse file
Can we detect that in code? The main issue I see is that we can do it for when we have a filter
(entry.header().as_gnu().sparse
) but not for the simple unpack
. But overcoming that, it would be a reliable to detect the situation and initiate some kind of fallback.
Beyond failing the entire fetch and falling back to source, we could also try to fallback to using the tar
cli tool if present. Then we could have the error say "install tar
" instead of / in addition to "tell upstream to not do sparse packing".
Also it seems that quick-install is only using normal tar -czf
, not the --sparse
option? So I'm unsure what caused this package, unless the tar they're using has it on by default or turns it on based on some heuristic.
Can we detect that in code? The main issue I see is that we can do it for when we have a filter (entry.header().as_gnu().sparse) but not for the simple unpack. But overcoming that, it would be a reliable to detect the situation and initiate some kind of fallback.
That's doable with #180 , since it provides download_tar_based_and_visit
, which enables you to pass a TarEntriesVisitor
, but then we have to extract each file by hand.
Also it seems that quick-install is only using normal
tar -czf
, not the--sparse
option? So I'm unsure what caused this package, unless the tar they're using has it on by default or turns it on based on some heuristic.
I am genuinely not sure what is happening then.
I tried with both the MacOS 12 finder and bsdtar 3.5.1 - libarchive 3.5.1 zlib/1.2.11 liblzma/5.0.5 bz2lib/1.0.8
and they are ok.
Only tar
gives me the error and it is reproducible.
And I haven't yet noticed another tarball that gives me the same error.
Uhhh that just happened again https://github.com/alsuren/cargo-quickinstall/issues/87 , this was discovered in #301 .
Hi there. I'm as surprised as everyone else. This certainly wasn't on purpose.
I suspect that fixing it in the quickinstall archives is going to be a bit involved (and would result in holes in the package repo rather than packages that binstall would understand), but I had a crazy idea of how to fix it on the binstall side (if fixing it in the tar crate is too involved):
The tarballs are typically tiny, with a very small number of files, so you could just shell out to tar -xOzf - $bin_name
multiple times (see https://www.gnu.org/software/tar/manual/html_node/Writing-to-Standard-Output.html), passing the tarball into stdin and reading the extracted file from stdout.
It's a hack, but it feels like the kind of hack that can be cleanly encapsulated behind a nice API, and it should be portable to all platforms that cargo-quickinstall supports.
The tarballs are typically tiny, with a very small number of files, so you could just shell out to tar -xOzf - $bin_name multiple times (see https://www.gnu.org/software/tar/manual/html_node/Writing-to-Standard-Output.html), passing the tarball into stdin and reading the extracted file from stdout.
IMHO I don't like this solution, because it adds runtime external dependency to cargo-binstall
, which we are striving to avoid.
I would prefer upstream to fix this issue, and it looks like there's already a PR for this https://github.com/alexcrichton/tar-rs/pull/298
@passcod It seems that the upstream is especially slow when it comes to PR review. Perhaps we can maintain a fork that contains the fix https://github.com/alexcrichton/tar-rs/pull/298 as a temporary solution.
@NobodyXu have you verified that the upstream PR actually fixes the issue?
@NobodyXu have you verified that the upstream PR actually fixes the issue?
Not yet, I would do a test right now.
@alsuren I have tested it here and it indeed fixed the bug.
I've created a fork https://github.com/cargo-bins/tar-rs and would create a PR for this.
We can close this now that we've got it forked, I think. We'll want to change back once the upstream merges, but it's no longer a bug here.