gunicorn automatically decompresses .gz-files on request

Open AliveDevil opened this issue 2 years ago • 2 comments

It looks like gunicorn/Flask automatically decompresses .gz files unless the user agent sends Accept-Encoding: identity.

Wrong behavior:

curl -svLO --output-dir ~/Downloads/ http://asu.lab.xc/store/209df15a940d20178f0895ee13f899db/openwrt-23.05.0-7948dc7ce6b9-x86-64-generic-squashfs-combined-efi.img.gz
*   Trying 172.19.216.9:80...
* Connected to asu.lab.xc (172.19.216.9) port 80 (#0)
> GET /store/209df15a940d20178f0895ee13f899db/openwrt-23.05.0-7948dc7ce6b9-x86-64-generic-squashfs-combined-efi.img.gz HTTP/1.1
> Host: asu.lab.xc
> User-Agent: curl/7.88.1
> Accept: */*
> 
< HTTP/1.1 200 OK
< Cache-Control: no-cache
< Content-Disposition: inline; filename=openwrt-23.05.0-7948dc7ce6b9-x86-64-generic-squashfs-combined-efi.img.gz
< Content-Type: application/octet-stream
< Date: Thu, 02 Nov 2023 22:52:17 GMT
< Etag: "1698964230.0-20120104-1427253092"
< Last-Modified: Thu, 02 Nov 2023 22:30:30 GMT
< Server: gunicorn
< Transfer-Encoding: chunked
< 

Expected behavior:

curl -H "Accept-Encoding: identity" -svLO --output-dir ~/Downloads/ http://asu.lab.xc/store/209df15a940d20178f0895ee13f899db/openwrt-23.05.0-7948dc7ce6b9-x86-64-generic-squashfs-combined-efi.img.gz
*   Trying 172.19.216.9:80...
* Connected to asu.lab.xc (172.19.216.9) port 80 (#0)
> GET /store/209df15a940d20178f0895ee13f899db/openwrt-23.05.0-7948dc7ce6b9-x86-64-generic-squashfs-combined-efi.img.gz HTTP/1.1
> Host: asu.lab.xc
> User-Agent: curl/7.88.1
> Accept: */*
> Accept-Encoding: identity
> 
< HTTP/1.1 200 OK
< Cache-Control: no-cache
< Content-Disposition: inline; filename=openwrt-23.05.0-7948dc7ce6b9-x86-64-generic-squashfs-combined-efi.img.gz
< Content-Encoding: gzip
< Content-Length: 20120104
< Content-Type: application/octet-stream
< Date: Thu, 02 Nov 2023 22:53:03 GMT
< Etag: "1698964230.0-20120104-1427253092"
< Last-Modified: Thu, 02 Nov 2023 22:30:30 GMT
< Server: gunicorn

When a request carries no Accept-Encoding header (both requests above send Accept: */*), gunicorn/Flask decompresses the static .gz file stored on the filesystem while serving it. This wastes bandwidth (compare the missing Content-Length and Content-Encoding and the added Transfer-Encoding: chunked in the first response) and puts extra strain on the ASU server, which now has to decompress the file on the fly.
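To make the header comparison concrete, here is a small Python sketch that diffs the two responses. The header values are copied from the curl transcripts above (date-like headers omitted); the helper name `header_diff` is made up for illustration and is not part of gunicorn or Flask.

```python
# Response headers copied from the two curl transcripts in this report.
WITHOUT_ACCEPT_ENCODING = {
    "Cache-Control": "no-cache",
    "Content-Type": "application/octet-stream",
    "Server": "gunicorn",
    "Transfer-Encoding": "chunked",
}
WITH_IDENTITY = {
    "Cache-Control": "no-cache",
    "Content-Encoding": "gzip",
    "Content-Length": "20120104",
    "Content-Type": "application/octet-stream",
    "Server": "gunicorn",
}


def header_diff(a, b):
    """Return (only_in_a, only_in_b) as sorted lists of lower-cased header names."""
    names_a = {name.lower() for name in a}
    names_b = {name.lower() for name in b}
    return sorted(names_a - names_b), sorted(names_b - names_a)


only_default, only_identity = header_diff(WITHOUT_ACCEPT_ENCODING, WITH_IDENTITY)
print("only without Accept-Encoding:", only_default)
print("only with Accept-Encoding: identity:", only_identity)
```

The first list contains only the chunked transfer header, the second the two headers that show the stored .gz file being sent as-is.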

This behavior is observed with:

  • Chrome
  • curl
  • KGet

The expected behavior is observed with:

  • curl -H "Accept-Encoding: identity"
  • wget

Unfortunately, I have no experience with gunicorn or Flask, so I have nothing more to add here.

Nov 02 '23 23:11 AliveDevil

Sorry, I'm confused: what is happening? You're requesting a file and gunicorn is unpacking it? From my point of view it should always just "serve" the files. I'm using gunicorn behind a Caddy server, so I'm not really familiar with gunicorn's header handling either.

Nov 03 '23 09:11 aparcar

Exactly that is happening.

gunicorn serves the gzip-compressed files by sending them uncompressed over the wire.

Just wanted to make you aware of that. In the end I put nginx in front and serve static files through that now.

Nov 03 '23 10:11 AliveDevil