
archive.append(stream) does not call stream.pipe() synchronously

Open aalexgabi opened this issue 3 years ago • 1 comment

I use the same stream to do both a direct upload and a zip upload to avoid downloading the file twice:

// Create source stream
const sourceBucketStream = sourceBucket.file('image.png').createReadStream();

// Create long stream
const longStream = fs.createReadStream('big-file');

archive.append(longStream, { name: 'big-file' });
archive.append(sourceBucketStream, { name: 'image.png' });
archive.finalize();

await Promise.all([
  promisifiedPipeline(
    sourceBucketStream,
    destinationBucket.file('image.png').createWriteStream(),
  ),
  promisifiedPipeline(
    archive,
    destinationBucket.file('archive.zip').createWriteStream(),
  )
])

I added some debug statements on events of the source streams and the order of events is the following:

  • sourceBucketStream.pipe() is called by the direct bucket upload
  • data is passed for the direct upload
  • sourceBucketStream ends because all data was uploaded
  • sourceBucketStream.pipe() is called by archiver
  • the stream ends immediately because it has already been consumed
  • file is 0 bytes in the archive

Archiver calls sourceBucketStream.pipe() only later, so backpressure from its consumption does not start until then. Sometimes the image file has already been uploaded and the entire stream consumed before archiver starts piping and reading the file.
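For illustration, here is a minimal reproduction sketch of the same race outside archiver (file names are placeholders): once a readable has been drained by one consumer, a consumer that only attaches afterwards receives nothing.

const fs = require('fs');

const source = fs.createReadStream('image.png'); // placeholder file

// First consumer attaches immediately and drains the source.
source.pipe(fs.createWriteStream('copy-1.png'));

source.on('end', () => {
  // The source is fully consumed at this point. A consumer that only
  // attaches now (as archiver does when it processes the entry later)
  // gets no data: copy-2.png ends up 0 bytes.
  source.pipe(fs.createWriteStream('copy-2.png'));
});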

As a workaround, I now download the source file twice instead of piping the same stream twice.

The expected behavior is that archiver.append(stream) calls stream.pipe() synchronously to ensure that no data is lost when the stream is shared by multiple consumers.

Here is an example: https://stackoverflow.com/questions/19553837/node-js-piping-the-same-readable-stream-into-multiple-writable-targets
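For reference, a minimal sketch of the multi-consumer pattern the linked answer describes, assuming Node's built-in stream module and placeholder file names: give each consumer its own PassThrough and attach both to the source before any data flows.

const fs = require('fs');
const { PassThrough } = require('stream');
const archiver = require('archiver');

const archive = archiver('zip');
archive.pipe(fs.createWriteStream('archive.zip')); // placeholder destination

const source = fs.createReadStream('image.png'); // placeholder file

// One PassThrough per consumer, attached synchronously so neither misses data.
const forUpload = new PassThrough();
const forArchive = new PassThrough();
source.pipe(forUpload);
source.pipe(forArchive);

forUpload.pipe(fs.createWriteStream('direct-copy.png')); // stands in for the direct upload
archive.append(forArchive, { name: 'image.png' });
archive.finalize();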

aalexgabi avatar Mar 22 '21 12:03 aalexgabi

As a workaround I do this:

// pipeline and PassThrough come from Node's built-in 'stream' module
const { pipeline, PassThrough } = require('stream');

// Make sure pipe is called immediately to avoid losing data https://github.com/archiverjs/node-archiver/issues/510
archive.append(pipeline([stream, new PassThrough()], (err) => {
  if (err) {
    // Log error here
  }
}), { name });

aalexgabi avatar Apr 19 '21 13:04 aalexgabi
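A note on why this workaround helps, as far as I can tell: stream.pipeline() returns the last stream in the chain, here the PassThrough, and the source is piped into it synchronously when append() is called. Whatever archiver does later, it reads from that PassThrough, which has been receiving the data (subject to normal backpressure) from the start, so nothing is lost even if archiver only processes the entry after the source has ended.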