laravel-zipstream
Error executing "HeadObject" ; AWS HTTP error: cURL error 6: getaddrinfo() thread failed to start
Hello guys!
I'm trying to generate a zip file with several images (about 5,000) that are on S3. At some point a HeadObject error pops up even though the file exists. Has anyone run into this?
ERROR: Error executing "HeadObject" on "https://xpto.s3.us-west-2.amazonaws.com/imagem/img-1789ee4b-0040-4f3d-8760-b683506e65a8.jpg"; AWS HTTP error: cURL error 6: getaddrinfo() thread failed to start
(see https://curl.haxx.se/libcurl/c/libcurl-errors.html) for https://xpto.s3.us-west-2.amazonaws.com/imagem/img-1789ee4b-0040-4f3d-8760-b683506e65a8.jpg {"userId":1,"exception":"[object] (Aws\\S3\\Exception\\S3Exception(code: 0): Error executing \"HeadObject\" on \"https://xpto.s3.us-west-2.amazonaws.com/imagem/img-1789ee4b-0040-4f3d-8760-b683506e65a8.jpg\"; AWS HTTP error: cURL error 6: getaddrinfo() thread failed to start
(see https://curl.haxx.se/libcurl/c/libcurl-errors.html) for https://xpto.s3.us-west-2.amazonaws.com/imagem/img-1789ee4b-0040-4f3d-8760-b683506e65a8.jpg at /home/ubuntu/apps/vendor/aws/aws-sdk-php/src/WrappedHttpHandler.php:195)
I believe this is happening due to the large number of files. Is there any way to process this in chunks?
public function downloadFiles(ExportacaoEntrega $exportacao)
{
    $zip = Zip::create($exportacao->descricao . '.zip');

    foreach ($exportacao->imagens as $imagem) {
        try {
            // Build the s3:// URI for the source object
            $protocolo = 's3://';
            $bucket = env('AWS_BUCKET');
            $filePath = Storage::disk('s3')->path($imagem->source . $imagem->nome);
            $path = $protocolo . $bucket . '/' . $filePath;

            // Add the file to the zip under a per-equipment folder
            $zip->add($path, $exportacao->descricao . '/' . $equipamento->id . '/' . $imagem->nome);
        } catch (\Exception $e) {
            Log::error("unable to read the file at storage path: $imagem->nome and output to zip stream. Exception is " . $e->getMessage());
        }
    }

    return $zip;
}
Do you know the size of your files (say, from a database record)? The HeadObject call is attempting to look up the filesize using the S3 API. If you already have the size, you can cut down the number of S3 API calls significantly.
I don't think the S3 API provides any way to get these file sizes or retrieve the file contents in chunks.
@jszobody from what I've seen, AWS is identifying the volume of requests as a DDoS attack and refusing them. I tried using ZIPSTREAM_PREDICT_SIZE=false but I still get the same error. Any idea how to solve it?
Do you know the size of your files
If you are, say, looping through DB records to build this Zip, and you already have the size of each file, then you can eliminate the HEAD lookups entirely.
use STS\ZipStream\Models\File;

// later on in your loop...
$path = $protocolo . $bucket . '/' . $filePath;
$destination = $exportacao->descricao . '/' . $equipamento->id . '/' . $imagem->nome;

$zip->add(
    File::make($path, $destination)->setFilesize($imagem->filesize)
);
This assumes of course that your $imagem model has a filesize attribute. If this exists, set it like above, and this package won't need to make 5,000 API calls to S3 to figure out file sizes. This should solve your issue.
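If that column doesn't exist yet, one way to populate it (a rough sketch, assuming you control the upload path; the field and request names here are just illustrative) is to record the size at upload time, since the byte count is already known locally:
// Hypothetical upload handler: store the byte size alongside the S3 key so the
// zip builder never needs to HEAD the object later.
$uploaded = $request->file('imagem');                // Illuminate\Http\UploadedFile
$key = $uploaded->store($imagem->source, 's3');      // upload and get the S3 key back

$imagem->nome = basename($key);
$imagem->filesize = $uploaded->getSize();            // size in bytes, known before upload
$imagem->save();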
Hey @jszobody!
I am also experiencing a similar issue. I am looping through items in a database, but when I hit the 79th item I get the following exception.
Error executing "HeadObject" on "https://bucket-name.s3.eu-west-2.amazonaws.com/path/to/file"; AWS HTTP error: cURL error 6: Could not resolve host: bucket-name.s3.eu-west-2.amazonaws.com
If I limit the items to 78 the download completes. I have confirmed it is not the 79th file itself causing the issue: if I delete the file named in the exception it just moves on to the next, and I can also change the offset and it is still the 79th file that triggers the issue.
Looking back through the stack trace, the error is happening here.

Unfortunately I do not have the file size to hand to set the Content-Length.
Hopefully it's not rate-limit related and is something simple.
Thanks!
@coatezy That error message shows that curl can't connect to S3. That's a low-level issue with your server not being able to resolve and connect, and is way outside anything that this package is doing.
@jszobody This was my initial thought, but the fact that it consistently failed on the 79th item made me think it may have been an AWS rate-limit issue, although the response did not reflect what I would have expected to see, and as per the AWS docs these numbers are well within the thresholds.
https://aws.amazon.com/premiumsupport/knowledge-center/s3-503-within-request-rate-prefix/
I'll continue to investigate at a lower level. 👍
@coatezy Are all of your files in the same bucket? Is it possible that the bucket name is different on file 79?
@jszobody It may have been an artisan serve related issue. I set up the app within Sail and I am no longer seeing this issue.
Out of interest, how long should it take for a download to start? With 200 files it takes approx. 35 seconds, with or without the ZIPSTREAM_PREDICT_SIZE=false env variable set. I was expecting the stream to begin instantly either way? It's not a DB bottleneck, as I can return the paths of all files in a couple hundred ms. I'm specifically trying to use this with Heroku to get around their 30s initial-response limit.
Tried setting the X-Accel-Buffering header too as per the ZipStream-PHP wiki. https://github.com/maennchen/ZipStream-PHP/wiki/nginx
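In case it helps anyone else, this is roughly how I'm setting that header from the Laravel side (a minimal sketch; the middleware name is my own, and it only has an effect when the app sits behind nginx):
// app/Http/Middleware/DisableResponseBuffering.php (illustrative name)
namespace App\Http\Middleware;

use Closure;
use Illuminate\Http\Request;

class DisableResponseBuffering
{
    public function handle(Request $request, Closure $next)
    {
        $response = $next($request);

        // Tell nginx not to buffer the streamed zip response.
        $response->headers->set('X-Accel-Buffering', 'no');

        return $response;
    }
}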
Thanks for your help!
Sorry, I know this is off topic, but if I prevent zip64 support from being enabled and turn off size prediction, the download begins to stream immediately. I wonder if it is worth adding an env option to disable zip64 if desired. If it's not directly related to zip64, maybe it is the getFilesize call within https://github.com/stechstudio/laravel-zipstream/blob/master/src/ZipStream.php#L188.
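For context, this is the underlying library option I mean (a sketch against ZipStream-PHP 2.x directly, not this package's API, assuming I have the option names right):
use ZipStream\Option\Archive;
use ZipStream\ZipStream;

// Archive options for the underlying ZipStream-PHP library (v2.x).
$options = new Archive();
$options->setEnableZip64(false);   // force a plain zip32 archive
$options->setZeroHeader(true);     // don't require file sizes up front

$zip = new ZipStream('export.zip', $options);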
Out of interest, how long should it take for a download to start?
It should start instantly once it has all the file sizes. This is again where having those in your local DB would be a lifesaver. Otherwise you have 200 API calls to S3 to get all the file sizes, and that slows things down.
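If the sizes aren't stored yet, a one-off backfill is another option (a rough sketch, assuming a filesize column and an Imagem Eloquent model; Storage::size() still issues one HEAD per file, but only once and outside the download request):
// Backfill missing sizes in batches so future zips can skip the HEAD calls.
Imagem::whereNull('filesize')->chunkById(500, function ($imagens) {
    foreach ($imagens as $imagem) {
        $imagem->update([
            'filesize' => Storage::disk('s3')->size($imagem->source . $imagem->nome),
        ]);
    }
});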
I'm going to close this for now, since it seems like it was an artisan serve issue.