
Pipe to PassThrough never finalizes

Open atomicpages opened this issue 3 years ago • 4 comments

Hey there, thanks for the awesome library! Found weird behavior piping the archive to a PassThrough stream that's easily reproducible:

For some reason, when no data handler is present on the PassThrough stream, archive.finalize() exits prematurely because self._module never ends.

https://github.com/archiverjs/node-archiver/blob/9af81df7d2165a5691eeda130ff14e49c43c79c3/lib/core.js#L794

I tried on Node 14.18.2 and 16.13.0. Here's the smallest reproducible example I was able to create:

const fs = require("fs");
const path = require("path");
const archiver = require("archiver");
const Stream = require("stream");

(async function () {
  const pass = new Stream.PassThrough();
  const archive = archiver("zip", { zlib: { level: 0 } });

  archive.on("error", console.error);
  pass.on("error", console.error);

  archive.on("warn", console.warn);

  // uncomment and finalize works as expected
  // pass.on("data", () => null);

  pass.on("close", () => {
    console.log("stream closed");
  });

  archive.pipe(pass);

  const files = []; // paths to files on disk

  files.forEach((file) => {
    archive.append(fs.createReadStream(file), { name: path.basename(file) });
  });

  await archive.finalize();
})();

The odd thing is that if I have a data handler on the PassThrough stream, things work as expected.
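
For reference, it doesn't seem to have to be a data handler specifically: anything that puts the PassThrough into flowing mode (a pipe destination, resume(), or a data listener) should drain it the same way. A minimal sketch that pipes it on to a file instead (out.zip is a hypothetical path):

const fs = require("fs");
const path = require("path");
const Stream = require("stream");
const archiver = require("archiver");

(async function () {
  const pass = new Stream.PassThrough();
  const archive = archiver("zip", { zlib: { level: 0 } });

  archive.pipe(pass);
  // Any consumer drains the PassThrough; without one it sits paused
  // and the archive never finishes.
  pass.pipe(fs.createWriteStream(path.join(__dirname, "out.zip")));

  archive.append(Buffer.from("hello"), { name: "hello.txt" });
  await archive.finalize();
})();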

Update

Interestingly enough, the same behavior is observed without creating a new PassThrough stream.

atomicpages · Jan 30 '22 23:01

I had a similar issue, and setting highWaterMark helped me out: const archive = archiver("zip", { zlib: { level: 0 }, highWaterMark: byteSizeOfFile }); where byteSizeOfFile equals the file size of the appended file.
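
A quick sketch of that workaround applied to the repro above (assuming a single appended file whose size is known up front; the zip output adds headers on top of the file data, so treat the exact value as an approximation):

const fs = require("fs");
const archiver = require("archiver");

const file = "path/to/file.txt"; // hypothetical path
const byteSizeOfFile = fs.statSync(file).size;

// A large enough highWaterMark lets archiver buffer the whole output
// internally, so finalize() can complete even before anything reads
// from the archive stream.
const archive = archiver("zip", {
  zlib: { level: 0 },
  highWaterMark: byteSizeOfFile,
});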

esahin90 · May 17 '22 14:05

I'm having the exact same issue:

I have the following Node.js code, where I try to append readable streams to node-archiver and pipe the archiver to a Node.js PassThrough stream:

  
  const archive = Archiver('zip', {
    zlib: { level: zlib.constants.Z_BEST_SPEED },
    highWaterMark: 10 * 1024 * 1024
  });
  archive.on('error', (error) => {
    logger.error(`archive on error: ${error.name} ${error.code} ${error.message} ${error.path} ${error.stack}`);
    throw new Error(`${error.name} ${error.code} ${error.message} ${error.path} ${error.stack}`);
  });


  const streamPassThrough = new Stream.PassThrough();


  logger.info(`check 5`)

  await new Promise((resolve, reject) => {
    logger.info("Starting upload of the output Files Zip Archive");
    
    logger.info(`check 5.1`)
    s3FilesDownloadSteams.forEach((item) => {
      logger.info(`archive.append: item.fileName: ${item.fileName}`)
      archive.append(item.stream, { name: item.fileName })
    });

    logger.info(`check 5.2`);
    archive.pipe(streamPassThrough);
    logger.info(`check 5.3`)
    // streamPassThrough.on('data', (chunk) => {
    //   console.log(`Received ${chunk.length} bytes of data.`);
    // });
    logger.info(`check 5.4`)
    streamPassThrough.on('close', resolve);
    logger.info(`check 5.5`)
    streamPassThrough.on('end', resolve);
    logger.info(`check 5.6`)
    streamPassThrough.on('error', reject);
    logger.info(`check 5.7`)
    archive.finalize();
    logger.info(`check 5.8`)
  }).catch((error) => {
    logger.error(`Stream flow error: ${error.name} ${error.code} ${error.message} ${error.path} ${error.stack}`);
  });

  logger.info(`check 6`)


However, with the streamPassThrough event handler for 'data' commented out:

    // streamPassThrough.on('data', (chunk) => {
    //   console.log(`Received ${chunk.length} bytes of data.`);
    // });

The code never leaves the Promise: check 5.1 through check 5.8 get printed, but check 6 never gets printed.

The weird thing is that if I uncomment the streamPassThrough event handler for 'data', like this:

    logger.info(`check 5.3`)
    streamPassThrough.on('data', (chunk) => {
       console.log(`Received ${chunk.length} bytes of data.`);
    });
    logger.info(`check 5.4`)

Then it works fine and check 6 is printed.

What is going on here? Why does the streamPassThrough event handler for 'data' affect the code at all?
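
Edit: for context, streamPassThrough here ultimately feeds an S3 upload. A rough sketch of giving it that consumer directly (assuming @aws-sdk/lib-storage; the function, bucket, and key names are hypothetical):

const Stream = require("stream");
const Archiver = require("archiver");
const { S3Client } = require("@aws-sdk/client-s3");
const { Upload } = require("@aws-sdk/lib-storage");

async function uploadZip(s3FilesDownloadStreams) {
  const streamPassThrough = new Stream.PassThrough();
  const archive = Archiver("zip");
  archive.pipe(streamPassThrough);

  s3FilesDownloadStreams.forEach((item) => {
    archive.append(item.stream, { name: item.fileName });
  });

  // The Upload consumes streamPassThrough, so the archive keeps flowing
  // and finalize() can settle without an extra 'data' handler.
  const upload = new Upload({
    client: new S3Client({}),
    params: {
      Bucket: "my-output-bucket", // hypothetical
      Key: "output-files.zip", // hypothetical
      Body: streamPassThrough,
    },
  });

  await Promise.all([archive.finalize(), upload.done()]);
}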

DaCao · Aug 19 '22 10:08

Up, I have the same issue.

jaxalo · Mar 03 '23 08:03

Late response, but better late than never, maybe...

In situations without the data listener, the stream can stay paused. Try calling streamPassThrough.resume().

FWIW, I've found that in my own use case I needed to manually force the PassThrough to close when the data finished piping. You can accomplish this by listening to the archive close event and calling streamPassThrough.push(null).
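
A sketch of both suggestions applied to the repro from the top of the thread (whether 'close', 'end', or 'finish' is the right archive event to hook may depend on the archiver version, so treat that part as an assumption):

const Stream = require("stream");
const archiver = require("archiver");

(async function () {
  const pass = new Stream.PassThrough();
  const archive = archiver("zip", { zlib: { level: 0 } });

  archive.pipe(pass);

  // Without a 'data' listener the PassThrough stays paused; resume()
  // switches it into flowing mode so archiver's output gets consumed.
  pass.resume();

  // If the PassThrough still never closes on its own, force it to end
  // once the archive stream has finished.
  archive.on("close", () => pass.push(null));

  archive.append(Buffer.from("hello"), { name: "hello.txt" });
  await archive.finalize();
})();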

Winters44 · Jul 01 '23 18:07