node-archiver

Error extracting ZIP

Open paul-zadorozhniy opened this issue 7 years ago • 8 comments

Getting an error while extracting zip file on Mac and Windows

[Screenshot: Mac error, 2017-08-25 at 1:43:52 pm]

[Screenshot: Windows error, 2017-08-25 at 1:44:58 pm]

It looks like this happens only with large files (the archive contains a 5 GB file); smaller files work fine.

    this.outputSource = path.normalize(`${__dirname}/../../../${new Date().getTime()}.zip`);
    this.output = fs.createWriteStream(this.outputSource);

    this.zip = archiver('zip', { zlib: { level: 9 } });

    this.output.on('close', () => {
      logger.info(`${this.course.guid} material ${this.zip.pointer()} total bytes`);
      logger.info(`${this.course.guid} material archiver has been finalized and the output file descriptor has closed.`);
    });

    this.zip.on('warning', err => logger.error(err));
    this.zip.on('error', err => logger.error(err));

    this.zip.pipe(this.output);

    this.zip.directory(`${this.tmpFolder}/`, false);
    this.zip.finalize();

paul-zadorozhniy avatar Aug 25 '17 14:08 paul-zadorozhniy

Likely a zip64 issue, but it's trickier to test. Also, some built-in zip tools don't support zip64.
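One way to check whether an archive actually carries ZIP64 records is to look at its tail. The following is a minimal sketch, not part of node-archiver: `hasZip64Locator` is a hypothetical helper name, and it assumes the caller passes in the last bytes of the file as a `Buffer`. The two signatures come from the ZIP format specification. A comment that happens to contain the same byte pattern could fool the backwards scan, so treat the result as a hint rather than proof.

```javascript
// ZIP record signatures from the ZIP format specification (little-endian on disk).
const EOCD_SIG = 0x06054b50;          // end of central directory record
const ZIP64_LOCATOR_SIG = 0x07064b50; // ZIP64 end of central directory locator

// Hypothetical helper: given the last bytes of an archive, report whether a
// ZIP64 locator sits directly in front of the end of central directory record.
function hasZip64Locator(tail) {
  // The EOCD record is at least 22 bytes; scan backwards to skip any comment.
  for (let i = tail.length - 22; i >= 0; i--) {
    if (tail.readUInt32LE(i) === EOCD_SIG) {
      // The ZIP64 locator is a fixed 20 bytes and immediately precedes the EOCD.
      return i >= 20 && tail.readUInt32LE(i - 20) === ZIP64_LOCATOR_SIG;
    }
  }
  return false; // no EOCD found: truncated input or not a ZIP file
}
```

To use it on a real file, read roughly the last 64 KiB into a `Buffer` (the archive comment can be at most 65,535 bytes) and pass that buffer in.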

ctalkington avatar Aug 25 '17 23:08 ctalkington

I'm seeing the same issue after running node-archiver on a directory that contains a file larger than 4 GB. Attempting to open the ZIP on Windows results in the error above. If I attempt to extract via 7-Zip, I'm able to see the contents, but the file larger than 4 GB shows as being 1 byte in size:

image

The original ZIP is 4.19 GB. Attempting to extract the contents to a folder results in errors:

1   C:\...\foo.zip
    Headers Error
    Unconfirmed start of archive
    Warnings:
    There are some data after the end of the payload data
2   CRC failed : file.txt

Both files are extracted with the sizes listed above; the 4 GB file (which is a video file) results in a 1 byte file that cannot be played. I see this error whether or not forceZip64 is set on the archiver.

EDIT: If it helps, the process using node-archiver runs on Debian Jessie.

camlegleiter avatar Oct 04 '17 22:10 camlegleiter

I'm also wondering whether this is related to zlib under the hood. From the zlib FAQ:

32. Can zlib work with greater than 4 GB of data?

Yes. inflate() and deflate() will process any amount of data correctly. Each call of inflate() or deflate() is limited to input and output chunks of the maximum value that can be stored in the compiler's "unsigned int" type, but there is no limit to the number of chunks. Note however that the strm.total_in and strm_total_out counters may be limited to 4 GB. These counters are provided as a convenience and are not used internally by inflate() or deflate(). The application can easily set up its own counters updated after each call of inflate() or deflate() to count beyond 4 GB. compress() and uncompress() may be limited to 4 GB, since they operate in a single call. gzseek() and gztell() may be limited to 4 GB depending on how zlib is compiled. See the zlibCompileFlags() function in zlib.h.

The word "may" appears several times above since there is a 4 GB limit only if the compiler's long type is 32 bits. If the compiler's long type is 64 bits, then the limit is 16 exabytes.

If the version of zlib that Node uses is compiled with a 32-bit long type, then the maximum size of data would be 4 GB, at least based on this answer.
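The arithmetic behind the 4 GB figure can be sketched in a few lines. This is only an illustration of 32-bit wraparound and of the FAQ's advice to keep an application-level counter; it is not how Node or zlib is implemented, and `wrapToUint32`, `countBytes`, and the 5 GiB size are made up for the example.

```javascript
// Largest value an unsigned 32-bit counter or size field can hold.
const UINT32_MAX = 0xffffffff; // 4,294,967,295, one byte short of 4 GiB

// What a 32-bit counter would report after seeing `bytes` bytes.
function wrapToUint32(bytes) {
  return bytes % 2 ** 32;
}

// An application-level counter, as the FAQ suggests. JavaScript numbers are
// exact for integers up to 2^53 - 1, so plain addition is safe far beyond 4 GB.
function countBytes(chunkSizes) {
  let total = 0;
  for (const size of chunkSizes) total += size;
  return total;
}

const fiveGiB = 5 * 2 ** 30;             // hypothetical file size
console.log(fiveGiB > UINT32_MAX);       // true: does not fit in 32 bits
console.log(wrapToUint32(fiveGiB));      // 1073741824, i.e. 1 GiB after wrapping
```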

camlegleiter avatar Oct 04 '17 23:10 camlegleiter

image

Attached is the difference when running the yazl module with ZIP64 forced (top) and node-archiver with ZIP64 forced (bottom). I apologize that the first file is different and affects some of the values. However, the video file (which is 4.2 GB) has the same CRC32 value in both cases.

Is node-archiver correctly using ZIP64 as expected when forceZip64: true? When I dug through the code, it wasn't clear to me where it was used down in node-compress-commons.

camlegleiter avatar Oct 05 '17 19:10 camlegleiter

It does look like we should be setting a different min version to extract here. I'm working through some other compatibility fixes in node-compress-commons. I would love to have you test once I push out a minor release this weekend, to see if you still see the same behavior.
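For anyone who wants to inspect this on an existing archive, here is a minimal sketch that reads the "version needed to extract" field from a local file header. It assumes that is the field meant by "min version to extract"; the helper names are hypothetical and none of this is taken from node-compress-commons. Per the ZIP format specification, the field is a 16-bit little-endian value at offset 4 of the header, and 45 (version 4.5) is the value listed for files that use ZIP64 extensions.

```javascript
const LOCAL_HEADER_SIG = 0x04034b50; // local file header signature
const ZIP64_MIN_VERSION = 45;        // 4.5, stored as major * 10 + minor

// Hypothetical helper: read "version needed to extract" from a Buffer that
// starts at a local file header.
function versionNeededToExtract(header) {
  if (header.length < 6 || header.readUInt32LE(0) !== LOCAL_HEADER_SIG) {
    throw new Error('not a ZIP local file header');
  }
  return header.readUInt16LE(4);
}

// True when the entry declares at least version 4.5.
function declaresZip64Version(header) {
  return versionNeededToExtract(header) >= ZIP64_MIN_VERSION;
}
```

The first entry's header starts at byte 0 of most archives, so reading the first 6 bytes of the file is usually enough for a quick check.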

ctalkington avatar Oct 07 '17 20:10 ctalkington

Sure. Just give me a heads up when you have something pushed out and I'll play around with it locally.

camlegleiter avatar Oct 13 '17 15:10 camlegleiter

Hi. Personally, I had forgotten to finalize the archive. Calling archive.finalize(); fixed the unzipping error on Mac. Thanks for this great tool!

Arthy74 avatar Oct 28 '17 14:10 Arthy74

As I understand it, zip64 needs to be used in order to zip large files (> 4 GB). This can be applied by setting forceZip64 to true, which will be passed to zlib. However, this setting will not be respected if no compression is used, since zlib will not be used in this case. Thus it seems impossible to use zip64 in combination with no compression.

These lines in the zip-stream module seem to cause the problem (https://github.com/archiverjs/node-zip-stream/blob/1d479dd9e68e66463769be03809ce3a1e6cf400d/index.js#L41):

  if (typeof options.zlib.level === 'number' && options.zlib.level === 0) {
    options.store = true;
  }

After removing this check it was possible, at least on Mac, to unzip archives larger than 10 GB. Before removing the check it was not possible.

Suggested solution

If forceZip64 is set to true, store should not be set to true.

  if (!options.forceZip64 && typeof options.zlib.level === 'number' && options.zlib.level === 0) {
    options.store = true;
  }
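The effect of the proposed check can be tried in isolation. The function below is a hypothetical stand-in written for this comment, not the actual zip-stream source; it only wraps the suggested condition so the three cases can be compared.

```javascript
// Hypothetical stand-in for the option handling quoted above.
// Returns a copy of the options with `store` set only when the proposed
// condition holds.
function resolveStore(options = {}) {
  const opts = { ...options, zlib: { ...(options.zlib || {}) } };
  if (!opts.forceZip64 && typeof opts.zlib.level === 'number' && opts.zlib.level === 0) {
    opts.store = true;
  }
  return opts;
}

// Level 0 without forceZip64: falls back to store mode, as before.
console.log(resolveStore({ zlib: { level: 0 } }).store);                   // true
// Level 0 with forceZip64: store mode is no longer forced.
console.log(resolveStore({ forceZip64: true, zlib: { level: 0 } }).store); // undefined
```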

afogelberg avatar Oct 14 '19 17:10 afogelberg