
Invalid archive produced when upstream connection is reset

Maloffstrano opened this issue 8 years ago · 3 comments

I'm wondering if anyone can help answer this question; there seems to be limited information about it on the web in general. We have a Ruby back-end producing a correct manifest for mod_zip. The files themselves reside on AWS S3 and are a mix of image stills and motion files, which can get quite large.
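
For context, the back-end response carries the X-Archive-Files: zip header, and its body lists one file per line in mod_zip's crc-32 / size / location / filename format. Ours looks roughly like the lines below (the CRC column is left as "-" and the sizes are made up for illustration):

  - 52316145 /amazons3/path/to/file_1.mov file_1.mov
  - 341286 /amazons3/path/to/file_2.jpg file_2.jpg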

Given this setup, mod_zip produces an invalid archive about as often as it produces a valid one.

I compiled NGINX 1.10.1 with debug support, along with the latest version of the mod_zip module.

From what I could glean about proxying files from an upstream, our manifest entries look like /amazons3/path/to/file.suffix. The relevant bit of the NGINX config looks like:

  location ~ "^/(amazons3)/(?<file>.*)" {
    resolver 8.8.8.8 8.8.4.4; # Google public DNS

    set $s3_domain         s3.amazonaws.com;

    proxy_http_version     1.1;
    proxy_set_header       Host $s3_domain;
    proxy_set_header       Authorization '';

    # We DO NOT want gzip content returned to us.  
    # See: https://github.com/evanmiller/mod_zip/issues/21
    #
    # Compiled --with-debug and debug logging turned on, we were 
    # sending to S3: Accept-Encoding: gzip, deflate
    #
    proxy_set_header       Accept-Encoding '';   

    proxy_hide_header      x-amz-id-2;
    proxy_hide_header      x-amz-request-id;
    proxy_hide_header      x-amz-server-side-encryption;

    proxy_hide_header      Set-Cookie;
    proxy_ignore_headers   "Set-Cookie";
    proxy_intercept_errors on;
    proxy_buffering        off; 
    
    # NOTE: HTTP, not HTTPS, because mod_zip resolves the domain to an IP address
    proxy_pass http://$s3_domain/bucket/$file;
  }

After seeing a number of [error] recv() failed (104: Connection reset by peer) while sending to client entries in the NGINX error.log, I decided to turn on debug logging to see what was going on. The relevant part of the log looks like this:


[debug] recv: fd:12 4096 of 4096
[debug] http output filter "/amazons3/path/to/file_1.mov?"
[debug] http copy filter: "/amazons3/path/to/file_1.mov?"
[debug] mod_zip: entering subrequest body filter

[debug] mod_zip: No range for subrequest to satisfy

[debug] http postpone filter "/amazons3/path/to/file_1.mov?" 00000000022CD1A8
[debug] write new buf t:0 f:0 0000000000000000, pos 00000000022CE260, size: 4096 file: 0, size: 0
[debug] http write filter: l:0 f:1 s:4096
[debug] http write filter limit 0
[debug] SSL buf copy: 4096
[debug] SSL to write: 4096
[debug] SSL_write: 4096
[debug] http write filter 0000000000000000
[debug] http copy filter: 0 "/amazons3/path/to/file_1.mov?"

[debug] recv: fd:12 4096 of 4096

[debug] http output filter "/amazons3/path/to/file_1.mov?"
[debug] http copy filter: "/amazons3/path/to/file_1.mov?"
[debug] mod_zip: entering subrequest body filter
[debug] mod_zip: No range for subrequest to satisfy
[debug] http postpone filter "/amazons3/path/to/file_1.mov?" 00000000022CD1A8
[debug] write new buf t:0 f:0 0000000000000000, pos 00000000022CE260, size: 4096 file: 0, size: 0
[debug] http write filter: l:0 f:1 s:4096
[debug] http write filter limit 0
[debug] SSL buf copy: 4096
[debug] SSL to write: 4096
[debug] SSL_write: 4096
[debug] http write filter 0000000000000000
[debug] http copy filter: 0 "/amazons3/path/to/file_1.mov?"
[debug] recv: fd:12 2800 of 4096
[debug] http output filter "/amazons3/path/to/file_1.mov?"
[debug] http copy filter: "/amazons3/path/to/file_1.mov?"
[debug] mod_zip: entering subrequest body filter
[debug] mod_zip: No range for subrequest to satisfy
[debug] http postpone filter "/amazons3/path/to/file_1.mov?" 00000000022CD1A8
[debug] write new buf t:0 f:0 0000000000000000, pos 00000000022CE260, size: 2800 file: 0, size: 0
[debug] http write filter: l:0 f:1 s:2800
[debug] http write filter limit 0
[debug] SSL buf copy: 2800
[debug] SSL to write: 2800
[debug] SSL_write: 2800
[debug] http write filter 0000000000000000
[debug] http copy filter: 0 "/amazons3/path/to/file_1.mov?"
[debug] recv: fd:12 -1 of 4096


[error] recv() failed (104: Connection reset by peer) while sending to client, client: 999.999.999.999, server: _, request: "POST /streaming-zipper HTTP/1.1", subrequest: "/amazons3/path/to/file_1.mov", upstream: "http://54.231.114.116:80/bucket/path/to/file_1.mov", host: "sts.d.c", referrer: "https://sts.d.c/streaming-zipper"


[debug] finalize http upstream request: 502
[debug] finalize http proxy request
[debug] free rr peer 1 0
[debug] close http upstream connection: 12
[debug] free: 0000000002247600, unused: 48
[debug] reusable connection: 0
[debug] http output filter "/amazons3/path/to/file_1.mov?"
[debug] http copy filter: "/amazons3/path/to/file_1.mov?"
[debug] mod_zip: entering subrequest body filter

[debug] mod_zip: No range for subrequest to satisfy

[debug] http postpone filter "/amazons3/path/to/file_1.mov?" 00007FFFCE7BC590
[debug] write new buf t:0 f:0 0000000000000000, pos 0000000000000000, size: 0 file: 0, size: 0
[debug] http write filter: l:0 f:1 s:0
[debug] http copy filter: 0 "/amazons3/path/to/file_1.mov?"
[debug] http finalize request: 0, "/amazons3/path/to/file_1.mov?" a:1, c:2
[debug] http wake parent request: "/streaming-zipper?"
[debug] http posted request: "/streaming-zipper?"
[debug] http writer handler: "/streaming-zipper?"
[debug] http output filter "/streaming-zipper?"
[debug] http copy filter: "/streaming-zipper?"
[debug] mod_zip: entering main request body filter

[debug] mod_zip: restarting subrequests

[debug] mod_zip: sending pieces, starting with piece 8 of total 15
[debug] mod_zip: no ranges / sending piece type 0
[debug] http postpone filter "/streaming-zipper?" 00000000022CD198
[debug] write new buf t:0 f:0 0000000000000000, pos 00000000022CF5E0, size: 99 file: 0, size: 0
[debug] http write filter: l:0 f:1 s:99
[debug] http write filter limit 0
[debug] SSL buf copy: 99
[debug] SSL to write: 99
[debug] SSL_write: 99
[debug] http write filter 0000000000000000
[debug] mod_zip: no ranges / sending piece type 1
[debug] mod_zip: subrequest for "/amazons3/path/to/file_2.jpg?"
[debug] mod_zip: have a wait context for "/amazons3/path/to/file_1.mov?"
[debug] mod_zip: wait "/amazons3/path/to/file_1.mov?" done
[debug] http subrequest "/amazons3/path/to/file_2.jpg?"
[debug] mod_zip: subrequest for "/amazons3/path/to/file_2.jpg?" result 0, allocating some mem on main request's pool
[debug] mod_zip: subrequest for "/amazons3/path/to/file_2.jpg?" result 0
[debug] mod_zip: sent 2 pieces, last rc = -2
[debug] http copy filter: -2 "/streaming-zipper?"
[debug] http writer output filter: -2, "/streaming-zipper?"
[debug] event timer: 3, old: 1481918834155, new: 1481918834368
[debug] http posted request: "/amazons3/path/to/file_2.jpg?"
[debug] rewrite phase: 0
[debug] test location: "/"
[debug] test location: ~ "\.(php|aspx?|jsp|cfm|go)$"
[debug] test location: ~ "^/(amazons3)/(?<file>.*)"
[debug] http regex set $file to "path/to/file_2.jpg"
[debug] using configuration "^/(amazons3)/(?<file>.*)"

What I gather from this debug trace is that the peer, S3, is spontaneously closing the connection to NGINX, which NGINX treats as a 502. NGINX finalizes the subrequest for file_1.mov and then immediately moves on to the next file in the manifest, file_2.jpg.

I'm not sure why S3 is closing the connection prematurely, but it appears that mod_zip doesn't try to resume the rest of file_1.mov because, for whatever reason, it isn't asking S3 for ranges to begin with.

So this brings me to the question: can mod_zip reliably stream really large files from S3? Is there something incorrect in the NGINX location /amazons3/ block? (The config is pieced together from a number of articles on the web.) Or should we be downloading the files to the NGINX server and zipping them up from there?
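
For the record, the download-first alternative we have in mind would mean staging the objects on local disk and pointing the manifest at an internal-only location instead of the S3 proxy. A rough sketch — the location name and cache path here are made up:

  location /local-files/ {
    internal;                        # reachable only via subrequests
    alias /var/cache/zip-staging/;   # illustrative staging directory
  }

The manifest lines would then point at /local-files/path/to/file.suffix rather than /amazons3/..., so the subrequests would never leave the box.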

If anybody has any knowledge about mod_zip and AWS S3, I'd appreciate your insight.

Thanks. Sheldon

Maloffstrano avatar Dec 16 '16 21:12 Maloffstrano

Hi, it looks like you've answered your own question! mod_zip does not have very intelligent retry logic, so remote upstreams should be used with care. I will rename the issue to reflect the underlying bug.

evanmiller avatar Oct 07 '17 11:10 evanmiller

Note to self: I believe the ngx_http_post_subrequest_t callback will come in handy here. We want some logic like:

  • If a subrequest fails, try again at least once
  • If a subrequest aborts early, try resuming it using a Range subrequest
  • If a subrequest continues to fail, return an error to the client

I might split this issue out into a bug and a feature. (Fixing the bug means returning an error in the event of failure; the feature would be adding retry logic of some kind.)
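
A rough sketch of how that callback might hang together follows. This is not mod_zip code: the per-file context type, its fields, and the zip_reissue_range_subrequest() helper are hypothetical; only the nginx types and functions are real.

  /* Sketch only: retry/resume logic hung off ngx_http_post_subrequest_t. */

  #include <ngx_config.h>
  #include <ngx_core.h>
  #include <ngx_http.h>

  #define ZIP_MAX_RETRIES 1

  typedef struct {
      off_t       expected_len;   /* size declared in the manifest line     */
      off_t       received_len;   /* bytes actually seen by the body filter */
      ngx_uint_t  retries;        /* re-issue attempts so far               */
  } zip_file_ctx_t;

  /* Hypothetical helper (implementation omitted): issue a new subrequest
   * for the missing tail, with "Range: bytes=<received_len>-" attached. */
  static ngx_int_t zip_reissue_range_subrequest(ngx_http_request_t *parent,
                                                zip_file_ctx_t *ctx);

  static ngx_int_t
  zip_post_subrequest_handler(ngx_http_request_t *sr, void *data, ngx_int_t rc)
  {
      zip_file_ctx_t *ctx = data;

      /* Subrequest failed (e.g. the 502 after a connection reset) or ended
       * short of the size promised in the manifest: try to resume it. */
      if ((rc != NGX_OK || ctx->received_len < ctx->expected_len)
          && ctx->retries < ZIP_MAX_RETRIES)
      {
          ctx->retries++;
          return zip_reissue_range_subrequest(sr->parent, ctx);
      }

      if (ctx->received_len < ctx->expected_len) {
          /* Out of retries: fail the request rather than emit a ZIP that
           * is silently truncated in the middle of an entry. */
          return NGX_ERROR;
      }

      return rc;
  }

  /* Registering the callback when the per-file subrequest is created: */
  static ngx_int_t
  zip_start_file_subrequest(ngx_http_request_t *r, ngx_str_t *uri,
                            zip_file_ctx_t *ctx)
  {
      ngx_http_request_t         *sr;
      ngx_http_post_subrequest_t *ps;

      ps = ngx_pcalloc(r->pool, sizeof(ngx_http_post_subrequest_t));
      if (ps == NULL) {
          return NGX_ERROR;
      }

      ps->handler = zip_post_subrequest_handler;
      ps->data = ctx;

      return ngx_http_subrequest(r, uri, NULL, &sr, ps, 0);
  }

Whether the re-issue can happen safely from inside the post-subrequest handler, or has to be deferred to the main request's body filter (where mod_zip restarts subrequests today), still needs checking.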

evanmiller avatar Oct 07 '17 14:10 evanmiller

Hi. I've returned to this thread after an extended time away from this problem. I've looked over the README and seen another project that uses this one (https://github.com/irasnyd/nginx_s3_streaming_zip).

I'm curious: did the bug and the feature ever get resolved? Are errors from S3 properly retried now? Or does the entire ZIP file get aborted (which, from what I gleaned in the README, is what actually happens)?

Maloffstrano avatar Nov 12 '19 17:11 Maloffstrano