
Shared dictionary compression support

Open nickchomey opened this issue 1 year ago • 12 comments

Shared Dictionary Compression is a new technique that is starting to become available in browsers and can reduce data transfer by up to 98%! It does this by leveraging the ability of Brotli (not relevant to Caddy) and zstd to use custom compression dictionaries when compressing a file. For example, the previous version of a script can serve as the dictionary for the new version, so that only the difference is sent.

More on it all here https://developer.chrome.com/blog/shared-dictionary-compression

And here https://www.debugbear.com/blog/shared-compression-dictionaries

It is still in an Origin Trial in Chrome until April 30, 2024, so perhaps the Caddy team could inquire about it such that support could be implemented when appropriate?

The zstd package used by caddy already supports custom dictionaries https://github.com/klauspost/compress/blob/de4073a3abdd00a2a95e608f9fcaf6ebf9141cc0/zstd/README.md?plain=1#L324

So it would be a matter of reading the relevant request headers from the browser client and adding the relevant response headers.
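As a rough illustration of the header side, here is a minimal Go sketch of how a server might decide which encoding to answer with. It assumes the header and token names from the Compression Dictionary Transport draft linked further down the thread (`Available-Dictionary` on the request, `dcz` as the dictionary-compressed zstd content coding); earlier trial builds may have used different names. The function `chooseEncoding` and its parameters are hypothetical and are not part of Caddy.

```go
package main

import "strings"

// chooseEncoding picks a Content-Encoding for a response (hypothetical helper).
//
// acceptEncoding and availableDictionary are the raw request header values.
// knownDictHash is the hash, in the same form the client sends it, of a
// dictionary the server still has on hand, or "" if it has none.
//
// Simplification: q-values are ignored, so "zstd;q=0" would still count.
func chooseEncoding(acceptEncoding, availableDictionary, knownDictHash string) string {
	accepts := map[string]bool{}
	for _, token := range strings.Split(acceptEncoding, ",") {
		// Drop any ";q=..." parameter and surrounding whitespace.
		name, _, _ := strings.Cut(token, ";")
		accepts[strings.ToLower(strings.TrimSpace(name))] = true
	}

	// The dictionary path is only usable when the client advertises exactly
	// the dictionary we hold.
	dictUsable := knownDictHash != "" &&
		strings.TrimSpace(availableDictionary) == knownDictHash

	switch {
	case dictUsable && accepts["dcz"]:
		return "dcz" // zstd with a shared dictionary
	case accepts["zstd"]:
		return "zstd" // plain zstd: the "progressive enhancement" fallback
	case accepts["gzip"]:
		return "gzip"
	default:
		return "identity"
	}
}
```

A real handler would also need to set `Vary` so caches key on both `Accept-Encoding` and `Available-Dictionary`, and to send `Use-As-Dictionary` on responses that later requests may use as dictionaries.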

Thanks for the consideration!

nickchomey avatar Mar 29 '24 19:03 nickchomey

We could do that. (I'm curious how they solved the privacy problems associated with shared dictionaries. I haven't read up on it.)

Doesn't surprise me that zstd supports it, but that's the least common encoding browsers accept, currently. We'd need support in gzip and brotli libs as well.

Will keep an eye on it!

mholt avatar Mar 29 '24 20:03 mholt

Brotli does support shared dictionary compression, but caddy doesn't (and seemingly won't, for performance reasons) support brotli. I doubt gzip would ever support this - I don't think it has any such custom compression dictionary mechanism.

It seems to me that zstd will continue to become more ubiquitous in browsers, CDNs, etc. (Chromium 123 stable has it now, https://caniuse.com/zstd, so Edge etc. have it too), so shared dictionary compression can just be a sort of "progressive enhancement" when a browser supports it (which, it seems to me, is how much of the browser header negotiation works, be it content encoding, languages, etc.).

I'm sure the chrome team would be happy to explain if you join the origin trial! https://developer.chrome.com/origintrials/#/view_trial/2583940286203822081

Here are some more detailed resources:

  • https://use-as-dictionary.com/
  • https://github.com/WICG/compression-dictionary-transport/blob/main/README.md

nickchomey avatar Mar 29 '24 21:03 nickchomey

This is exciting, especially for those of us who pay egress bills to cloud providers.

The custom dictionary is public with no ability for authentication. Ensuring privacy of information inside the dictionary is completely up to the implementor. Command-line training on a few dozen files looks like the way most would implement it.

ottenhoff avatar Apr 01 '24 21:04 ottenhoff

Brotli is supported via https://github.com/dunglas/caddy-cbrotli if you're willing to build Caddy with CGO. We can't add it to the website though because we disable CGO for our build server.

francislavoie avatar Apr 01 '24 22:04 francislavoie

> The custom dictionary is public with no ability for authentication. Command-line training on a few dozen files looks like the way most would implement it.

The main point of shared dictionaries is that there's no singular custom dictionary. Literally every file/script can be its own custom dictionary, allowing up to like 99% compression on file changes.

The various links I shared above go into plenty more detail about this.
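To make the "previous version as dictionary" idea concrete, here is a small Go sketch. Note the substitution: it uses raw DEFLATE preset dictionaries from the Go standard library (`compress/flate`) as a stand-in, because that needs no third-party packages. It is not zstd, not the `dcz`/`dcb` encodings browsers negotiate, and not anything Caddy does today. The helper names are hypothetical.

```go
package main

import (
	"bytes"
	"compress/flate"
	"io"
)

// compressWithDict deflates data, optionally priming the compressor with dict
// (for example, the previous version of the same file). A nil dict gives
// ordinary, dictionary-less compression.
func compressWithDict(data, dict []byte) ([]byte, error) {
	var buf bytes.Buffer
	var w *flate.Writer
	var err error
	if dict == nil {
		w, err = flate.NewWriter(&buf, flate.BestCompression)
	} else {
		w, err = flate.NewWriterDict(&buf, flate.BestCompression, dict)
	}
	if err != nil {
		return nil, err
	}
	if _, err := w.Write(data); err != nil {
		return nil, err
	}
	if err := w.Close(); err != nil {
		return nil, err
	}
	return buf.Bytes(), nil
}

// decompressWithDict reverses compressWithDict. The receiver must already
// hold the same dict, which is the role the browser cache plays.
func decompressWithDict(compressed, dict []byte) ([]byte, error) {
	r := flate.NewReaderDict(bytes.NewReader(compressed), dict)
	defer r.Close()
	return io.ReadAll(r)
}
```

Calling `compressWithDict(v2, v1)` for a new version `v2` that differs only slightly from `v1` should produce far fewer bytes than `compressWithDict(v2, nil)`, since most of `v2` can be encoded as back-references into `v1`. DEFLATE only looks back 32 KiB, so this stand-in understates what zstd or Brotli dictionaries can do on larger files.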

nickchomey avatar Apr 01 '24 22:04 nickchomey

> The main point of shared dictionaries is that there's no singular custom dictionary. Literally every file/script can be its own custom dictionary, allowing up to like 99% compression on file changes.

Agreed, but implementing this in a load balancer means holding state of former files and that sounds like a lot of work. Implementing one single dictionary as an argument to Caddy's existing "encode zstd" sounds relatively straightforward?
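For a sense of what "holding state of former files" could involve, here is a minimal Go sketch of an in-memory store keyed by dictionary hash. It assumes, per the Compression Dictionary Transport draft, that the client identifies its dictionary by a SHA-256 hash sent as a base64 byte sequence wrapped in colons. The type `dictStore` and its methods are hypothetical, and a real implementation would also need eviction and probably on-disk storage.

```go
package main

import (
	"crypto/sha256"
	"encoding/base64"
	"sync"
)

// dictStore is a hypothetical in-memory cache of previously served files,
// keyed by the value a client would send in Available-Dictionary.
type dictStore struct {
	mu    sync.RWMutex
	dicts map[string][]byte
}

func newDictStore() *dictStore {
	return &dictStore{dicts: make(map[string][]byte)}
}

// dictionaryID formats the SHA-256 of content as base64 wrapped in colons
// (assumed wire format, see the note above).
func dictionaryID(content []byte) string {
	sum := sha256.Sum256(content)
	return ":" + base64.StdEncoding.EncodeToString(sum[:]) + ":"
}

// Remember stores a private copy of content and returns its ID.
func (s *dictStore) Remember(content []byte) string {
	id := dictionaryID(content)
	s.mu.Lock()
	defer s.mu.Unlock()
	s.dicts[id] = append([]byte(nil), content...)
	return id
}

// Lookup returns the stored dictionary for an Available-Dictionary value,
// or false if the server no longer (or never) had it.
func (s *dictStore) Lookup(id string) ([]byte, bool) {
	s.mu.RLock()
	defer s.mu.RUnlock()
	d, ok := s.dicts[id]
	return d, ok
}
```

When `Lookup` misses, the server would simply fall back to ordinary compression, so the state is an optimization rather than a correctness requirement.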

ottenhoff avatar Apr 01 '24 23:04 ottenhoff

Yes, I suppose a singular custom dictionary mechanism could be made available (as well as other zstd config options, such as compression level, which doesn't appear to be configurable currently).

But that's not really the point of this issue. It's to implement whatever might be necessary to make full use of browsers' Shared Dictionary mechanisms.

I don't really see why it should be done at the load balancer level - the application servers that are being load balanced should have sufficient state and can return the relevant files.

Moreover, it doesn't seem relevant to me whether this might be more difficult to implement with some architectures - if it doesn't work for some, so be it.

nickchomey avatar Apr 02 '24 02:04 nickchomey

It appears that Shared Dictionary compression will officially land in Chromium v130, with a beta release set for September 18 and a stable public release on October 15, 2024.

The first link also shows that Firefox and Safari have signaled support for the feature, so it should eventually become a web standard.

It would be great to see this in Caddy, so that we can take advantage of it for Chromium, which represents at least 70% of global web traffic.

nickchomey avatar Aug 26 '24 14:08 nickchomey

An option to load a dictionary file should be pretty simple, as mentioned above. I'd welcome a PR to review!

mholt avatar Aug 26 '24 20:08 mholt