cargo-binstall icon indicating copy to clipboard operation
cargo-binstall copied to clipboard

Tracking of upstream bug in `tar`

Open NobodyXu opened this issue 2 years ago • 8 comments

https://github.com/alexcrichton/tar-rs/issues/295

tar failed to handle GNUSparseFile correctly.

NobodyXu avatar Jun 11 '22 03:06 NobodyXu

It seems that it would take quite some time for the upstream to fix it.

Since this bug is originally found when downloading cbindgen from cargo-quickinstall, @passcod shall we ask devs of cargo-quickinstall to stop using the gnu sparse file extension for it is not portable?

NobodyXu avatar Jun 13 '22 06:06 NobodyXu

We can try, but is this something particular to qi or could we get that for someone else?

More out there: Are there other tar crates we could evaluate? Or even, how hard would it be to fix ourselves upstream?

passcod avatar Jun 13 '22 06:06 passcod

We can try, but is this something particular to qi or could we get that for someone else?

cargo-binstall would not be able to extract any tarball that uses the gnu sparse file extension.

More out there: Are there other tar crates we could evaluate?

Not sure about that as the feature requires --sparse(-S) flag to be passed to tar to enable and most binaries are unlikely to have large enough holes to be a sparse file.

Besides, this problem is limited to the fetchers only.

fetch_crate_cratesio would never have this problem, since most crates are published using cargo-publish and cargo also uses tar.

Or even, how hard would it be to fix ourselves upstream?

tar seems to have some degree of gnu sparse file extension support:

  • It can recognize a gnu sparse file
  • It can parse the gnu sparse file header

But to extract a sparse file, it still need post-processing logic in xsparse and it also need to fix the filename, e.g. extracting the cbindgen tarball from qi using tar gives GNUSparseFile.0/cbindgen, which the file recognizes as a data file, not elf/executable.

NobodyXu avatar Jun 13 '22 06:06 NobodyXu

Another workaround that we could add is fallback to cargo-install when we detect the presence of GNUSparseFile.*, but that one is really a hack and I'm not sure it will work reliably.

NobodyXu avatar Jun 13 '22 06:06 NobodyXu

It can recognize a gnu sparse file

Can we detect that in code? The main issue I see is that we can do it for when we have a filter (entry.header().as_gnu().sparse) but not for the simple unpack. But overcoming that, it would be a reliable to detect the situation and initiate some kind of fallback.

Beyond failing the entire fetch and falling back to source, we could also try to fallback to using the tar cli tool if present. Then we could have the error say "install tar" instead of / in addition to "tell upstream to not do sparse packing".

passcod avatar Jun 13 '22 10:06 passcod

Also it seems that quick-install is only using normal tar -czf, not the --sparse option? So I'm unsure what caused this package, unless the tar they're using has it on by default or turns it on based on some heuristic.

passcod avatar Jun 13 '22 11:06 passcod

Can we detect that in code? The main issue I see is that we can do it for when we have a filter (entry.header().as_gnu().sparse) but not for the simple unpack. But overcoming that, it would be a reliable to detect the situation and initiate some kind of fallback.

That's doable with #180 , since it provides download_tar_based_and_visit, which enables you to pass a TarEntriesVisitor, but then we have to extract each file by hand.

NobodyXu avatar Jun 13 '22 11:06 NobodyXu

Also it seems that quick-install is only using normal tar -czf, not the --sparse option? So I'm unsure what caused this package, unless the tar they're using has it on by default or turns it on based on some heuristic.

I am genuinely not sure what is happening then.

I tried with both the MacOS 12 finder and bsdtar 3.5.1 - libarchive 3.5.1 zlib/1.2.11 liblzma/5.0.5 bz2lib/1.0.8 and they are ok.

Only tar gives me the error and it is reproducible.

And I haven't yet noticed another tarball that gives me the same error.

NobodyXu avatar Jun 13 '22 11:06 NobodyXu

Uhhh that just happened again https://github.com/alsuren/cargo-quickinstall/issues/87 , this was discovered in #301 .

NobodyXu avatar Aug 19 '22 09:08 NobodyXu

Hi there. I'm as surprised as everyone else. This certainly wasn't on purpose.

I suspect that fixing it in the quickinstall archives is going to be a bit involved (and would result in holes in the package repo rather than packages that binstall would understand), but I had a crazy idea of how to fix it on the binstall side (if fixing it in the tar crate is too involved):

The tarballs are typically tiny, with a very small number of files, so you could just shell out to tar -xOzf - $bin_name multiple times (see https://www.gnu.org/software/tar/manual/html_node/Writing-to-Standard-Output.html), passing the tarball into stdin and reading the extracted file from stdout.

It's a hack, but it feels like the kind of hack that can be cleanly encapsulated behind a nice API, and it should be portable to all platforms that cargo-quickinstall supports.

alsuren avatar Aug 19 '22 11:08 alsuren

The tarballs are typically tiny, with a very small number of files, so you could just shell out to tar -xOzf - $bin_name multiple times (see https://www.gnu.org/software/tar/manual/html_node/Writing-to-Standard-Output.html), passing the tarball into stdin and reading the extracted file from stdout.

IMHO I don't like this solution, because it adds runtime external dependency to cargo-binstall, which we are striving to avoid.

I would prefer upstream to fix this issue, and it looks like there's already a PR for this https://github.com/alexcrichton/tar-rs/pull/298

NobodyXu avatar Aug 19 '22 11:08 NobodyXu

@passcod It seems that the upstream is especially slow when it comes to PR review. Perhaps we can maintain a fork that contains the fix https://github.com/alexcrichton/tar-rs/pull/298 as a temporary solution.

NobodyXu avatar Sep 09 '22 08:09 NobodyXu

@NobodyXu have you verified that the upstream PR actually fixes the issue?

alsuren avatar Sep 09 '22 08:09 alsuren

@NobodyXu have you verified that the upstream PR actually fixes the issue?

Not yet, I would do a test right now.

NobodyXu avatar Sep 09 '22 08:09 NobodyXu

@alsuren I have tested it here and it indeed fixed the bug.

NobodyXu avatar Sep 09 '22 08:09 NobodyXu

I've created a fork https://github.com/cargo-bins/tar-rs and would create a PR for this.

NobodyXu avatar Sep 10 '22 04:09 NobodyXu

We can close this now that we've got it forked, I think. We'll want to change back once the upstream merges, but it's no longer a bug here.

passcod avatar Sep 10 '22 08:09 passcod