
Support custom dictionaries

Open ricea opened this issue 5 years ago • 3 comments

The "deflate" format supports preset dictionaries. These allow backreferences, from the very start of the data, to refer to items in the dictionary as if it were prepended to the uncompressed data. This can significantly improve the compression ratio, particularly for small inputs. See FDICT in RFC 1950. Preset dictionaries are also a common feature of other compression formats.
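To make the effect concrete, here is a minimal sketch using Node.js's `zlib` module, which already accepts a `dictionary` option. It is a stand-in for illustration only, not the proposed CompressionStream API, and the dictionary and payload contents are made up for the example.

```javascript
// Node.js sketch: zlib-format ("deflate") compression with a preset dictionary.
// The dictionary and input below are hypothetical sample data.
const zlib = require("node:zlib");

// Byte sequences expected to recur in the payloads.
const dictionary = Buffer.from('{"id":,"name":"","email":"","createdAt":""}');
const input = Buffer.from(
  '{"id":42,"name":"Ada","email":"ada@example.com","createdAt":"2020-02-06"}'
);

// Compress once without and once with the preset dictionary.
const plain = zlib.deflateSync(input);
const withDict = zlib.deflateSync(input, { dictionary });

// FDICT is bit 5 of FLG, the second header byte (RFC 1950).
const fdictSet = (withDict[1] & 0x20) !== 0;

// The decompressor must be given the same dictionary.
const roundTrip = zlib.inflateSync(withDict, { dictionary });

console.log({
  plainBytes: plain.length,
  withDictBytes: withDict.length,
  fdictSet,
  roundTripOk: roundTrip.equals(input),
});
```

Comparing the two logged sizes shows how much the dictionary helps for an input this small; the gain depends entirely on how well the dictionary matches the data.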

This should be supported by CompressionStream and DecompressionStream.

For CompressionStream, an obvious API would be

const cs = new CompressionStream("deflate", { dictionary: aBufferSourceObject });

An open question is whether it is necessary to be able to pass multiple dictionaries to DecompressionStream (keyed by the Adler32 checksum), or whether just passing a single dictionary is sufficient. If we only support passing a single dictionary, this requires the calling code to either know by some out-of-band method what dictionary is in use, or parse the Adler32 checksum out of the header itself to choose the right dictionary.
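For the single-dictionary option, the "parse the checksum out of the header" step could look roughly like the sketch below. It follows the RFC 1950 header layout (CMF, FLG, then a 4-byte big-endian DICTID when FDICT is set). The helper names `readDictId` and `chooseDictionary` are hypothetical and not part of any existing or proposed API.

```javascript
// Adler-32 checksum as defined in RFC 1950.
function adler32(bytes) {
  let a = 1;
  let b = 0;
  for (const byte of bytes) {
    a = (a + byte) % 65521;
    b = (b + a) % 65521;
  }
  return ((b << 16) | a) >>> 0;
}

// Hypothetical helper: returns the DICTID from a zlib header,
// or null when the stream does not use a preset dictionary.
function readDictId(header) {
  if (header.length < 2) throw new RangeError("need at least 2 header bytes");
  const cmf = header[0];
  const flg = header[1];
  if ((cmf & 0x0f) !== 8) throw new Error("compression method is not deflate");
  if (((cmf << 8) | flg) % 31 !== 0) throw new Error("FCHECK mismatch");
  if ((flg & 0x20) === 0) return null; // FDICT not set
  if (header.length < 6) throw new RangeError("need 6 bytes to read DICTID");
  return ((header[2] << 24) | (header[3] << 16) | (header[4] << 8) | header[5]) >>> 0;
}

// Hypothetical helper: picks the candidate whose Adler-32 matches the DICTID.
// Returns undefined if no dictionary is needed or none matches.
function chooseDictionary(header, candidates) {
  const id = readDictId(header);
  if (id === null) return undefined;
  const byId = new Map(candidates.map((dict) => [adler32(dict), dict]));
  return byId.get(id);
}
```

The calling code would have to buffer the first six bytes of the stream before constructing the DecompressionStream, which is the main ergonomic cost of supporting only a single dictionary.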

ricea (Feb 06 '20 04:02)

API responses with a fixed schema would be an obvious use case here. For example, one could generate a dictionary from a GraphQL schema, and even weight its entries by the number of requests per field key.

mormahr (Oct 22 '20 01:10)

@mormahr How do you feel about { dictionary: aBufferSourceObject } vs. { dictionaries: { 0x12345678: aBufferSourceObject } }?

We could of course support both, but that would be ugly.

ricea (Oct 22 '20 04:10)

I have no idea how all of that works. Since there isn't anything available in the browser (yet), I haven't looked further into my idea, so I don't really know what the proper API design is.

mormahr (Oct 22 '20 13:10)