Change Install-ChocolateyZipPackage to fully extract .tar.xz archives

Open alexchandel opened this issue 9 years ago • 11 comments

It was recently suggested that Install-ChocolateyZipPackage only produces a .tar file when called on a .tar.xz, and that chocolatey#473 from the old repository was intended to correct this. This should be implemented.

alexchandel avatar Aug 17 '15 17:08 alexchandel

I will tackle this one.

DarwinJS avatar Oct 19 '16 09:10 DarwinJS

@ferventcoder - after looking at the code I noticed there is a lot of code just to get reasonable process monitoring and status back from 7zip.

So I am contemplating having Get-ChocolateyUnzip.ps1 call itself recursively for a second pass at the file since all that logic would be reused for the second pass.

Some questions / observations for your comment:

  • Does a recursive call seem like a reasonable approach given all you know about Get-ChocolateyUnzip?
  • If I'm thinking right, the checksum of the .tar.gz would already have been checked - so should not have any problems that the enclosed .tar does not have a checksum.
  • I believe the same code could handle all extensions that result in a .tar on the first pass that 7zip handles (e.g. tar.gz, .tgz, .tar.bz2)
  • from my reading about doing this: trying to avoid an intermediary .tar on disk can lead to memory problems on large archives - so thinking of just writing the file for highest compatibility.
  • thinking of just letting the .tar write to the -destination parameter (first pass is business as usual) and then detecting it was a .tar and recursing Get-ChocolateyUnzip back to the same location again
  • I am thinking the intermediary .tar should be deleted since it is an intermediary artifact that just consumes disk?

DarwinJS avatar Oct 19 '16 09:10 DarwinJS

I think something like this code at the end of the function might do it?

  • New param -recursivecall added as a precaution so that it can be referenced on secondary calls to avoid infinite loops or to enhance or suppress messaging (for instance, if the recursed call errors, it would be good to know you are recursed).
  • Last Three lines of sample are original code to show the placement of this code.
  • only looks for a .tar that matches first part of our original archive without extensions - to avoid unzipping embedded tars that have nothing to do with our problem. Would not work with edge cases where the embedded tar name does not match the enclosing archive name (where the embedded tar is the only file) - but that seems more edge case than accidentally picking up and extracting embedded tars that are not the direct result of a .tar.gzip type archive.
  • this method of detection should be somewhat future proof because it does not have to check every compound archive extension that 7zip supports now or in future (e.g. ".tar.gzip", ".tgz", ".tar.bz2", etc). It just checks for a resultant archive that matches our file spec without extension(s) and processes it if it exists.
  • limited to one recursion or else we'd have an infinite loop.
  • I can't figure out if the logic to avoid regular return actions on the recursive call is ill advised => [ If (!$RecursiveCall) ]

```
  # If our unzip resulted in a .tar file that matches our input zip file without its extensions, unzip it.
  # Does not work with archive names that contain a '.' before the extensions.
  $matchingtarname = "$destinationNoRedirection\$((Split-Path -Leaf $fileFullPathNoRedirection).Split('.')[0]).tar"
  if ((Test-Path $matchingtarname) -and (!$RecursiveCall))
  {
    Write-Host "Automatically detected this archive contained a .tar matching the original archive name, unzipping `"$matchingtarname`" ..."
    # Second pass: extract the intermediate .tar to the same destination, then delete it.
    Get-ChocolateyUnzip -fileFullPath "$matchingtarname" -destination "$destination" -specificFolder "$specificFolder" -packagename "$packagename" -recursivecall
    Remove-Item $matchingtarname
  }

  if (!$RecursiveCall)
  {
    # Untouched end of function code below to show position of above
    $env:ChocolateyPackageInstallLocation = $destination
    return $destination
  }
}
```
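
The snippet's own comment notes that splitting on the first '.' breaks for archive names that contain a dot before the extensions (version numbers, for example). One possible alternative, sketched here for illustration only, is to strip just the compression suffix. `Get-InnerTarName` is a hypothetical helper, not part of Chocolatey, and the list of short suffixes (.tgz, .tbz2, .txz) is an assumption about which archives 7zip unpacks to a .tar.

```powershell
# Hypothetical helper: derive the name of the intermediate .tar that a first
# extraction pass is expected to leave behind. Returns $null for other archives.
function Get-InnerTarName {
    param([Parameter(Mandatory = $true)][string]$ArchiveName)

    # Keep only the file name, accepting both '\' and '/' as separators.
    $leaf = @($ArchiveName -split '[\\/]')[-1]

    # Assumed short forms: name.tgz is treated as name.tar.
    if ($leaf -match '^(?<base>.+)\.(tgz|tbz2|txz)$') {
        return "$($Matches['base']).tar"
    }

    # Compound forms: name.tar.gz, name.tar.xz, name.tar.bz2 become name.tar.
    if ($leaf -match '^(?<base>.+\.tar)\.[^.]+$') {
        return $Matches['base']
    }

    return $null
}
```

Because only the last suffix is removed, a name such as `kafka_2.11-0.10.1.0.tgz` keeps its version dots. This still depends on knowing the original archive name at the point where the check runs.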

DarwinJS avatar Oct 19 '16 10:10 DarwinJS

OK - my proposal for conservatively determining the embedded archive name does not work because Get-ChocolateyWebFile or Get-WebFile rename the fetched archive to [packageid][install].extension.

@ferventcoder - is there a variable available to me (maybe from a parent or global scope) while I am in Get-ChocolateyUnzip.ps1 that would reveal the original archive name?

If not, then we'd have to do something like unzip a found *.tar in ONLY the root of \tools back to the root of \tools. Maybe we limit the logic to do this ONLY if we find a SINGLE *.tar at the root of \tools (leading to the hopeful conclusion that the found *.tar was the only contents of the original archive) ?
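
A minimal sketch of that fallback, assuming the check is limited to the top level of one directory. `Get-SingleRootTar` and its `-Directory` parameter are hypothetical names used only for illustration; nothing here is existing Chocolatey code.

```powershell
# Hypothetical helper: return the full path of the .tar in the root of
# $Directory only when exactly one exists, otherwise return $null.
function Get-SingleRootTar {
    param([Parameter(Mandatory = $true)][string]$Directory)

    # No -Recurse, so nested folders are ignored. The extension is compared
    # exactly, so a name such as "data.tar.bak" does not count.
    $tars = @(Get-ChildItem -LiteralPath $Directory -File |
        Where-Object { $_.Extension -eq '.tar' })

    if ($tars.Count -eq 1) {
        return $tars[0].FullName
    }
    return $null
}
```

Returning `$null` for zero or multiple matches keeps the second pass conservative: it only runs when the lone .tar is plausibly the sole content of the original archive.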

DarwinJS avatar Oct 22 '16 15:10 DarwinJS

@ferventcoder - I dumped all variables from within Get-ChocolateyUnzip and I see that the original file name appears in both URL and ZipFileList.

  1. Which of these would be best to use given my objective?
  2. "ZipFileList" seems to imply that somewhere in the upbound call stack more than one zip may have been specified. Should I look for multiple .tars based on the zip file list in case multiple *.tar.gz files have been specified in the current run?

I have the code working against $url - but easy to change.

Since I have a working sample, I submitted the pull request so you can see the working code - can make any changes needed.

https://github.com/chocolatey/choco/pull/1026

DarwinJS avatar Oct 23 '16 14:10 DarwinJS

Definitely interested in seeing this implemented. I find the Java .tar.gz and so many others much easier to work with than handling a self extracting .exe that doesn't take parameters and then figuring out where it put things just to move them where I actually wanted them.

dragon788 avatar Oct 24 '16 21:10 dragon788

@ferventcoder - I had this code fail a test recently - it was acting like $url was not available in that scenario. Should "ZipFileList" be available in every scenario where Get-ChocolateyUnzip is called?

DarwinJS avatar Oct 28 '16 10:10 DarwinJS

@DarwinJS not completely sure on that. The code would be the best place to know.

ferventcoder avatar Nov 13 '16 13:11 ferventcoder

I originally found $url by dumping all variables from within the function. I have switched to $zipfilelist and left the function running on my machine for some weeks now and no problems.

So I check if the variable exists and also if it contains data - if so I know we can at least formulate a file name to check for the existence of a *.tar. If somehow $zipfilelist is sometimes populated when we aren't processing a zip file - no problem, the existence check simply fails and no one is the wiser.
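
The guard described above could look roughly like the following. `Test-VariableHasData` is a hypothetical helper, and since the type and contents of `$zipfilelist` are not spelled out in this thread, the sketch only treats the value as a string and checks that it is not blank.

```powershell
# Hypothetical helper: $true only if a variable with this name is visible
# from the current scope (including parent scopes) and holds non-blank data.
function Test-VariableHasData {
    param([Parameter(Mandatory = $true)][string]$Name)

    # SilentlyContinue turns "variable not found" into $null instead of an error.
    $var = Get-Variable -Name $Name -ErrorAction SilentlyContinue
    if ($null -eq $var) {
        return $false
    }
    return -not [string]::IsNullOrWhiteSpace([string]$var.Value)
}
```

A caller would wrap the file name construction and the `Test-Path` check in `if (Test-VariableHasData -Name 'zipfilelist') { ... }`, so that a missing or empty variable just skips the second pass.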

DarwinJS avatar Nov 13 '16 19:11 DarwinJS

I have set up automatic packages for both Kafka and Prometheus, but both of these packages depend on this feature working, as they are either tar.gz or tgz files which contain a tar that doesn't currently get extracted.

I would rather not write specialized code for these packages and instead rely on Install-ChocolateyZipPackage to handle this.

ChrisMagnuson avatar Feb 14 '17 20:02 ChrisMagnuson

Used the technique of running the Get-ChocolateyUnzip function after the initial Install-ChocolateyZipPackage and things are now working. Example. It would still be nice to have this built in but we are not blocked on this any longer.

ChrisMagnuson avatar Feb 23 '17 20:02 ChrisMagnuson