gziphandler icon indicating copy to clipboard operation
gziphandler copied to clipboard

Swappable gzip implementation?

Open whs opened this issue 4 years ago • 7 comments

Hi!

We at @wongnai have forked gziphandler internally to add swappable gzip implementation. In production, we swap compress/gzip with our fork of yasushi-saito/cloudflare-zlib which result in 43% less CPU percentage used in the middleware.

We haven't open sourced anything in this project yet, as they require extensive modification to all projects to make it work. I'd like to check with upstream first whether this is something you'll be willing to merge before starting to work on open sourcing it (eg. unfork the go module name).


The changes we made are:

  • Split gzipWriterPools, poolIndex, addLevelPool and their tests into another submodule
  • Add an interface for gzip implementation:
type GzipWriter interface {
	Close() error
	Flush() error
	Write(p []byte) (int, error)
}
  • The interface doesn't directly pool the underlying GzipWriter. The pooling is expected to be transparently done by the implementor of the interface. In the existing gzip implementation, the returned gzip.Writer is wrapped in a struct that when closed, also return the writer to the pool.
  • Implementations are swapped by build tag. The default build still use compress/gzip to avoid extra non-Go dependency.

For forked cloudflare-zlib and its integration with gziphandler we may open source it later after the PR made here is merged. We removed its built-in zlib code and just simply link with installed library.

whs avatar Jan 07 '20 15:01 whs

I'm also looking at this, and would love to have a swappable gzip implementation - or even just a faster builtin gzip implementation.

Profiling shows that gzip compression accounts for ~20% of the CPU cycles on our backends.

CAFxX avatar May 18 '20 23:05 CAFxX

Implementations are swapped by build tag. The default build still use compress/gzip to avoid extra non-Go dependency.

Is this needed? What if gziphandler was updated to take an (optional) GzipWriter and then users could customize it was needed (or not)? Or maybe I'm misunderstanding what you are proposing.

harrisonhjones avatar May 19 '20 15:05 harrisonhjones

Thanks for the suggestion, I agree that swappable interface is a better way to write it.

I've opened #106 to start open sourcing this

whs avatar Jun 15 '20 01:06 whs

I like your refactor in #106 and I think it should be merged.

However your alternate implementation of GzipWriter must not be in the same Go module (it might even be in another repo) for the following reasons:

  • to not add dependencies to the gziphandler module
  • yasushi-saito/cloudflare-zlib is apparently using CGO, and adding CGO code in this repo would not be wise. gziphandler has maximum portability and, as a user of the package, I want the project to keep that property.

dolmen avatar Apr 22 '21 11:04 dolmen

Good to see this thread still moving, although upstream still doesn't have plan to merge.

Haven't open source our gziphandler interface yet (maybe after it was merged). Our cloudflare-zlib was open sourced after this thread at https://github.com/wongnai/cloudflare-zlib . The major changes was to remove building of zlib with CGO (some glue C code remains) and instead dynamically link with it.

whs avatar Apr 23 '21 03:04 whs

https://github.com/klauspost/compress/tree/master/gzhttp#gzip-middleware is an updated and cleaned up fork.

klauspost avatar Jun 18 '21 14:06 klauspost

Just dropping by to mention that I also forked this library and added support for pluggable implementations (in addition to other features, like support for encodings beyond gzip, like zstandard and brotli): https://github.com/CAFxX/httpcompression

CAFxX avatar Jul 18 '21 08:07 CAFxX