Some files fail to unzip regardless of approach

pgbce opened this issue 3 years ago · 1 comment

Hello @ZJONSSON, I'm definitely looking for some support with this library. I've scoured all the issues and have seen all sorts of variations in approach to unzipping files from S3.

* A single zipped file that contains many individual files and folders.
* I do want to mention that this is not running within a Lambda.
* I've tested with multiple zipped files. Any zipped files that are less than 3GB seem to be unzipped without issues.
* For zipped files that are 90GB and above, some small files fail but other large files are extracted perfectly fine.

So far, I've worked with approach 1:

getObject({})
  .createReadStream()
  .pipe(unzipper.Parse({ forceStream: true, verbose: true }))
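
For reference, a minimal sketch of how that parse stream gets consumed entry by entry (assuming the same s3 client, myBucket and myKey as in the full script further down; /tmp is just a placeholder destination). Entries that aren't piped anywhere have to be drained with autodrain(), otherwise the parse stream stalls:

const fs = require('fs')
const unzipper = require('unzipper')

// Assumes s3, myBucket and myKey from the full script below.
async function parseApproach() {
  const zip = s3
    .getObject({ Bucket: myBucket, Key: myKey })
    .createReadStream()
    .pipe(unzipper.Parse({ forceStream: true }))

  for await (const entry of zip) {
    if (entry.type === 'File') {
      // Placeholder destination; the real script re-uploads to S3 instead.
      entry.pipe(fs.createWriteStream('/tmp/' + entry.path.split('/').pop()))
    } else {
      // Unconsumed entries must be drained or the parse stream halts.
      entry.autodrain()
    }
  }
}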

and with approach 2:

await unzipper.Open.s3(s3Client, {})
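
To help isolate whether the errors come from the decompression itself or only show up once the re-upload to S3 is involved, a local-disk variant of approach 2 can be sketched like this (same assumptions: the aws-sdk v2 s3 client, myBucket and myKey from the full script below):

const fs = require('fs')
const path = require('path')
const unzipper = require('unzipper')

// Assumes s3, myBucket and myKey from the full script below.
async function extractToDisk() {
  const directory = await unzipper.Open.s3(s3, { Bucket: myBucket, Key: myKey })
  for (const file of directory.files) {
    if (file.type !== 'File') continue
    const dest = path.join('/tmp', file.path.replace(/\//g, '_')) // flatten paths for the test
    await new Promise((resolve, reject) => {
      file.stream()
        .on('error', reject) // zlib errors for this entry surface here
        .pipe(fs.createWriteStream(dest))
        .on('error', reject)
        .on('finish', resolve)
    })
  }
}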

There seem to be two issues that I face whichever way I go.

  1. With approach 1, I receive Z_BUF_ERROR.
  2. With approach 2, I get:

     Catch error: Error: invalid stored block lengths
         at Zlib.zlibOnError [as onerror] (node:zlib:190:17)
       errno: -3,
       code: 'Z_DATA_ERROR'
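
Both errors are raised by zlib while an individual entry's stream is being read, so part of the question is whether a single bad entry is taking down the whole run. A sketch of wrapping each upload so a failing entry is logged and skipped instead of bubbling up (uploadEntry is a hypothetical helper, not part of unzipper):

// Hypothetical helper: one promise per entry so a corrupt entry can be
// logged and skipped instead of rejecting the whole loop.
function uploadEntry(s3, bucket, key, file) {
  return new Promise((resolve, reject) => {
    const body = file.stream().on('error', reject) // zlib errors surface on the entry stream
    s3.upload({ Bucket: bucket, Key: key, Body: body }).promise().then(resolve, reject)
  })
}

// Usage inside the for-await loop in the script below:
// try {
//   await uploadEntry(s3, myBucket, filePath + fileName, file)
// } catch (err) {
//   console.log('Entry failed: ', fileName, err.code)
// }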

Be warned, my JS is a little rusty. The full script:

const aws = require('aws-sdk')
const unzipper = require('unzipper')

const s3 = new aws.S3() // credentials/region are configured elsewhere
// myBucket and myKey come from configuration elsewhere in the script

async function runner() {
  try {
    let filePath = myKey.split('.').slice(0, -1).join('.')
    filePath += '/'
    console.log('Working with: ', myKey)
    console.log('File Destination: ', filePath)

    // const stream = s3
    //   .getObject({
    //     Bucket: myBucket,
    //     Key: myKey,
    //   })
    //   .createReadStream()
    //   .pipe(unzipper.Parse({ forceStream: true, verbose: true }))
    let s3Stream = await unzipper.Open.s3(s3, {
      Bucket: myBucket,
      Key: myKey,
    }).catch((error) => {
      console.log('catching:: ', error)
      throw error // otherwise s3Stream is undefined below and .files throws
    })

    console.log('counter:: ', s3Stream.files.length)

    for await (let file of s3Stream.files) {
      let fileName = file.path
      let type = file.type
      let fileSize = file.uncompressedSize

      console.log('FileName: ', fileName)
      console.log('FileType: ', type)
      console.log('FileSize: ', fileSize)

      let match = fileName.match('MACOS')
      let res = match !== null ? match.length : null
      console.log('Match: ', res)

      if (type === 'File' && res != 1) {
        let smallerFileSizeCalc = 100 * 1024 * 1024 //100MB

        let params = {
          Bucket: myBucket,
          Key: filePath + fileName,
          Body: file.stream()
        }
     
        params.ContentLength = fileSize
        
        if (fileSize != 0) {
          if (fileSize <= smallerFileSizeCalc) {           			
            // I've utilized s3.upload as well. 
            s3.putObject(params)
              .promise()
              .then((res) => {
                console.log('putObject res: ', res)
              })
              .catch((err) => {
                console.log('putObject err: ', err)
              })
          } else {
            // Multipart upload for larger entries; part size must be a
            // number (it was previously passed as a string).
            const partSize = 5 * 1024 * 1024 * 1024 // 5GB parts

            console.log('About to upload')
            console.log('fileName: ', fileName)
            console.log('partSize: ', partSize)

            const upload = new aws.S3.ManagedUpload({
              partSize: partSize,
              // queueSize: 1,
              leavePartsOnError: true,
              params: params,
              service: s3,
            })

            upload.on('httpUploadProgress', function (progress) {
              console.log(progress)
            })

            upload.send(function (err, data) {
              if (err) {
                console.log('Sending err: ')
                console.error(err)
              }
              console.log('Data after upload: ', data)
            })
          }
        }
      }
    }
  } catch (error) {
    console.log('Catch error: ', error)
  }
}

runner()
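
One more note on the script above: the smaller-file putObject calls are fired off without being awaited, so the loop keeps opening new entry streams while earlier uploads are still in flight, which adds a lot of concurrent pressure with a 90GB archive. A sketch of that branch with the upload awaited (same params, fileSize and smallerFileSizeCalc as above):

if (fileSize != 0 && fileSize <= smallerFileSizeCalc) {
  try {
    // Wait for each upload to finish before moving to the next entry,
    // so only one decompressed stream is in flight at a time.
    const res = await s3.putObject(params).promise()
    console.log('putObject res: ', res)
  } catch (err) {
    console.log('putObject err: ', err, 'for', fileName)
  }
}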

pgbce · Sep 27 '21 16:09

@ZJONSSON - any chance of providing some insight? Still experiencing issues.

pgbce · Feb 07 '22 18:02