echo icon indicating copy to clipboard operation
echo copied to clipboard

gzip response only if it exceeds a minimal length

Open ioppermann opened this issue 2 years ago • 5 comments

If the response is too short, e.g. a few bytes, compressing the response makes it even larger. The new parameter MinLength to the GzipConfig struct allows to set a threshold (in bytes) as of which response size the compression should be applied. If the response is shorter, no compression will be applied.

ioppermann avatar Sep 09 '22 13:09 ioppermann

I think this will break http.Flusher interface logic. If you partially write response to the client and that part is smaller than limit and now call flush - what will happen? What status client will see for flushes response as it seems that func (w *gzipResponseWriter) WriteHeader(code int) { has been changed. If you flush before hitting the limit and continue calling write and assuming that response was flushed and written to the client without zipping it - will gzipper kick in after limit has reached?

If GzipConfig.MinLength is to bee number it needs to have comments at which size it makes sense to use this feature. Writing 1 byte from handler gives me 26 byte of gzipped data. https://en.wikipedia.org/wiki/Gzip#File_format says that gzip header is 10 bytes and footer 8 bytes + data + some something else. so minimum is probably something around 19-20 bytes? Does it even make sense to set it larger number than 20?

aldas avatar Sep 09 '22 18:09 aldas

Also thread about https://webmasters.stackexchange.com/questions/31750/what-is-recommended-minimum-object-size-for-gzip-performance-benefits

aldas avatar Sep 09 '22 18:09 aldas

@aldas Thanks for your thoughts.

For the http.Flusher it must probably enforce a gzip'ed response (ignoring the threshold) in order not to break the interface logic.

Regarding the threshold for compressing the response, I would say it depends on the application. For an API with JSON responses, a lower threshold or no threshold at all might be appropriate. But if an API has many empty responses (or a simple OK), then compressing the response will make it bigger and consumes more CPU (on the server and client) and increases the latency. It really depends on the use-case at hand.

ioppermann avatar Sep 09 '22 18:09 ioppermann

I think that field needs definitely longer description than at the moment. I you way that most of the people would not understand what they are doing and start setting random numbers there. There should be declaration of related aspects that you should consider when changing this number so there would be somewhat informed decision.

aldas avatar Sep 09 '22 18:09 aldas

and probably gzipResponseWriter.buffer needs to be pooled as we are already pooling gzip.Writer

aldas avatar Sep 09 '22 19:09 aldas

Any more comments on this?

ioppermann avatar Mar 23 '23 20:03 ioppermann

Codecov Report

Patch coverage: 94.11% and no project coverage change.

Comparison is base (7d54690) 92.88% compared to head (684b743) 92.89%.

Additional details and impacted files
@@           Coverage Diff           @@
##           master    #2267   +/-   ##
=======================================
  Coverage   92.88%   92.89%           
=======================================
  Files          39       39           
  Lines        4526     4574   +48     
=======================================
+ Hits         4204     4249   +45     
- Misses        234      236    +2     
- Partials       88       89    +1     
Impacted Files Coverage Δ
middleware/compress.go 86.32% <94.11%> (+5.16%) :arrow_up:

Help us with your feedback. Take ten seconds to tell us how you rate us. Have a feature suggestion? Share it here.

:umbrella: View full report in Codecov by Sentry.
:loudspeaker: Do you have feedback about the report comment? Let us know in this issue.

codecov[bot] avatar Apr 12 '23 14:04 codecov[bot]

@ioppermann other than that test function should not consist of multiple testcases - divide it to multiple test functions or use table based tests, I think this PR is OK. After https://github.com/labstack/echo/pull/2355 I am little bit vary of changing this middleware but I can not find any problems to block to ATM :)

just to be sure I looked what other frameworks are doing - similar thing for Gin middleware https://github.com/nanmu42/gzip/blob/v1/writerwrapper.go FastHTTP does some minimal length checking https://github.com/valyala/fasthttp/blob/87cb886157ae414e178fd5bc14a57d5e1c8fcadb/http.go#L1766

@lammel could you please review this.

aldas avatar Apr 15 '23 21:04 aldas

@aldas I fixed the linter error from CI pipeline and I splitted up the tests such that each test only tests one thing.

ioppermann avatar Apr 17 '23 10:04 ioppermann

Serendipitously, just read the master branch code and tried to use the MinLength field, then realized this was only merged in the past 24 hours and hadn't been released yet.

This probably isn't the best place to request, but a tagged version would be great!

bnewbold avatar Jun 01 '23 02:06 bnewbold