Unsupported ZIP compression method (deflation-64-bit)

Open EarlGlynn opened this issue 1 year ago • 0 comments

Reproducible example ...

library(archive)
sourceName <- "D:/IRS/IRS990/XML/2021_TEOS_XML_01A.zip"
expandDir  <- "C:/Users/efg/Desktop/Temp/"
xmlFilename <- archive_extract(sourceName, expandDir)
Error: archive_extract.cpp:21 archive_read_data_block(): **Unsupported ZIP compression method (deflation-64-bit)**

You can download 2021_TEOS_XML_01A.zip (465 MB) from https://apps.irs.gov/pub/epostcard/990/xml/2021/2021_TEOS_XML_01A.zip

The .zip file is linked from this page: https://www.irs.gov/charities-non-profits/form-990-series-downloads (under "Form 990 Series (e-file) XML format", 2021 files).

Windows 10 shows 80,000 files extracted from this zip. Unzipping in Windows is quite slow and appears to run in the background, but it eventually unzips all 80,000 files.

7-zip 23.01 (x64) (https://www.7-zip.org/) uses multiple threads and extracts the 80,000 files in about two minutes.

The IRS introduced these new "TEOS_XML" zip files for years 2021-2023 about a month ago.

All the other .zips on that page (XML format, years 2015-2019) can be processed with the base R unzip function. These new "TEOS_XML" zip files fail in unzip with the warning: Warning: internal error in 'unz' code

I really want to use unzip or archive_extract from a Posit Notebook if possible.
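
In the meantime, a possible stopgap is to call the 7-Zip command line from R, since 7-Zip handles this file. This is only a minimal sketch: it assumes a `7z` executable is on the PATH, and the helper names `seven_zip_args` and `extract_with_7zip` are made up for illustration, not part of archive or base R.

```r
# Hypothetical helper: build the argument vector for `7z x`.
# "x" extracts with full paths, -o<dir> sets the output directory,
# -y answers "yes" to any overwrite prompts.
seven_zip_args <- function(zip_path, out_dir) {
  c("x", shQuote(zip_path), paste0("-o", shQuote(out_dir)), "-y")
}

# Hypothetical wrapper: run 7-Zip and return the files found in out_dir.
# Assumes `exe` names a 7-Zip executable that Sys.which() can locate.
extract_with_7zip <- function(zip_path, out_dir, exe = "7z") {
  if (!nzchar(Sys.which(exe))) {
    stop("Could not find '", exe, "' on the PATH")
  }
  status <- system2(exe, seven_zip_args(zip_path, out_dir))
  if (status != 0) {
    stop("7z exited with status ", status)
  }
  list.files(out_dir, recursive = TRUE, full.names = TRUE)
}

# Usage with the paths from the example above (not run here):
# xmlFilename <- extract_with_7zip(sourceName, expandDir)
```

This works around the error but does not address it; it also adds an external dependency, which is why native support in archive_extract or unzip would still be preferable.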

EarlGlynn, Jun 21 '23 19:06