Decompress ".Z" file (LZW compression)
Thanks for the amazingly useful package! ❤️
Description of the problem
I want to download a file that is compressed with the extension .Z: ftp://ftp.noc.soton.ac.uk/pub/sxj/clim/netcdf/hfns11a.nc.Z
I can download and decompress it fine on the command line using gunzip:
curl -O ftp://ftp.noc.soton.ac.uk/pub/sxj/clim/netcdf/hfns11a.nc.Z
gunzip hfns11a.nc.Z
My best attempt with pooch is:
import pooch
fname = pooch.retrieve(
    'ftp://ftp.noc.soton.ac.uk/pub/sxj/clim/netcdf/hfns11a.nc.Z',
    known_hash='4820bce249ce508642762764fa0daa9c4785d42524fc603fe945d25140c3eaad',
    processor=pooch.Decompress(method='LZMA'),
)
Any other method raises an error. However, I don't think LZMA is the right decompressor, because this data is compressed using LZW.
I have figured out how to decompress it in Python using the unlzw3 package:
import unlzw3
with open("hfns11a.nc.Z", "rb") as fp:
    uncompressed_data = unlzw3.unlzw(fp.read())
with open("uncompressed.nc", "wb") as fp:
    fp.write(uncompressed_data)
I have verified that the output matches the result of gunzip.
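One way to run such a check is to compare SHA256 digests of the two files. This is a minimal sketch using only the standard library; the helper names `sha256_of` and `same_content` are made up for illustration and are not part of pooch or unlzw3.

```python
import hashlib


def sha256_of(path, chunk_size=1 << 20):
    """Return the SHA256 hex digest of a file, read in chunks to limit memory use."""
    digest = hashlib.sha256()
    with open(path, "rb") as fp:
        # iter() with a sentinel keeps reading until fp.read() returns b"" (end of file)
        for chunk in iter(lambda: fp.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def same_content(path_a, path_b):
    """Return True if both files hash to the same SHA256 digest."""
    return sha256_of(path_a) == sha256_of(path_b)
```

For the files above, the comparison would be `same_content("uncompressed.nc", "hfns11a.nc")`, where `hfns11a.nc` is the file that gunzip leaves behind.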
Would there be interest in adding this decompressor to Pooch?
@rabernat that is definitely something that would be of interest! Is this something you'd want to work on?
Since it requires an extra dependency, it would be best if it's made optional. There are plenty of examples of how to implement and test this in the code base (for example, the SFTP and TQDM support) but I'd be happy to provide more specific guidance and help.
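As a rough starting point, an optional-dependency processor could look something like the sketch below. It relies on Pooch calling processors with `(fname, action, pooch)` and expecting the path of the processed file back. The class name `DecompressLZW` and the `decoder` argument are hypothetical and not part of Pooch; `decoder` only exists so the decoding function can be swapped out, for example in tests that run without unlzw3 installed.

```python
import os

try:
    import unlzw3  # optional dependency, only needed for ".Z" files
except ImportError:
    unlzw3 = None


class DecompressLZW:
    """Hypothetical processor that decompresses an LZW (".Z") file after download."""

    def __init__(self, decoder=None):
        # decoder: optional callable taking bytes and returning bytes.
        # When omitted, unlzw3.unlzw is used (assumed to be installed).
        self.decoder = decoder

    def __call__(self, fname, action, pooch):
        decoder = self.decoder
        if decoder is None:
            if unlzw3 is None:
                # Fail only when the feature is used, so the dependency stays optional
                raise ImportError(
                    "Decompressing '.Z' files requires the 'unlzw3' package."
                )
            decoder = unlzw3.unlzw
        output = fname + ".decomp"
        # Redo the work if the file was (re)downloaded or the output is missing
        if action in ("update", "download") or not os.path.exists(output):
            with open(fname, "rb") as compressed:
                data = decoder(compressed.read())
            with open(output, "wb") as decompressed:
                decompressed.write(data)
        return output
```

It would then be passed as `processor=DecompressLZW()` to `pooch.retrieve`. Importing unlzw3 inside a `try` block and raising only when the processor is called keeps the package importable for users who never touch ".Z" files.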