Zstd compression with dictionary based on schema

Open billouboq opened this issue 9 months ago • 2 comments

Prerequisites

  • [x] I have written a descriptive issue title
  • [x] I have searched existing issues to ensure the feature has not already been requested

🚀 Feature Proposal

Hello !

I was just quickly thinking: would it be possible to create a zstd dictionary based on the document schema to make compression/decompression way faster?

Motivation

Improve performance

Example

No response

billouboq avatar Apr 09 '25 08:04 billouboq

I took a look and, while this is an interesting idea, I don't think Mongoose can support this right now. The MongoDB Node driver uses @mongodb-js/zstd, a custom zstd implementation that doesn't support dictionary compression. The current API is just compress(data, compressionLevel) and decompress(data), with no dictionary support. Do you have any ideas for working around this, @billouboq?

vkarpov15 avatar Apr 11 '25 19:04 vkarpov15

Drivers have considered dictionary support in the past but decided not to implement this feature (https://jira.mongodb.org/browse/DRIVERS-2396). Supporting it would require server changes, because the server and client must share the same dictionary used for compression, and that breaks the stateless behavior of existing client + server compression.
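To illustrate the shared-dictionary requirement, here is a minimal sketch using the preset-dictionary option of Node's built-in zlib (deflate). This is a stand-in only: it is not zstd, not @mongodb-js/zstd, and not the MongoDB wire protocol, and the dictionary and payload contents are made up for the example.

```javascript
// Illustration only: deflate with a preset dictionary via Node's built-in zlib.
// This is NOT zstd and NOT how the MongoDB driver compresses messages.
const zlib = require('node:zlib');

// Hypothetical dictionary: byte patterns expected to recur in payloads.
const dictionary = Buffer.from('{"_id":"","name":"","email":"","createdAt":""}');

// Hypothetical payload that shares structure with the dictionary.
const payload = Buffer.from(
  '{"_id":"42","name":"Ada","email":"ada@example.com","createdAt":"2025-04-11"}'
);

// The sender compresses with and without the dictionary for comparison.
const withDict = zlib.deflateSync(payload, { dictionary });
const withoutDict = zlib.deflateSync(payload);

// A receiver holding the same dictionary recovers the payload.
const roundTrip = zlib.inflateSync(withDict, { dictionary });

// A receiver without the dictionary cannot decode the message.
let errorWithoutDictionary = null;
try {
  zlib.inflateSync(withDict);
} catch (err) {
  errorWithoutDictionary = err;
}

console.log('plain deflate bytes:', withoutDict.length);
console.log('dictionary deflate bytes:', withDict.length);
console.log('round trip ok:', roundTrip.equals(payload));
console.log('decode without dictionary failed:', errorWithoutDictionary !== null);
```

The failing decode is the point: once a dictionary is involved, both ends have to agree on it ahead of time, which is state that the current compress/decompress exchange does not need.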

Also, I'm open to suggestions about what creating a dictionary based on a schema might look like, but all of the underlying zstd APIs for creating dictionaries train the dictionary from sample documents. I'm not sure what that would look like in Mongoose. Would example documents be generated from the schema, serialized to bytes, and then fed into the trainer? Or something else?
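As a rough sketch of the "generate example documents, serialize, feed to the trainer" idea, the snippet below builds sample documents from a flat schema description and lays them out as one concatenated buffer plus a list of per-sample sizes, which is the input shape that zstd's C trainer (ZDICT_trainFromBuffer) takes. Everything here is an assumption for illustration: schemaShape, sampleValue, and generateSamples are hypothetical names, JSON stands in for BSON, no real Mongoose schema is walked, and no trainer is actually called.

```javascript
// Hypothetical flat schema description. A real Mongoose schema would have to be
// walked (nested paths, arrays, subdocuments) instead of using a plain object.
const schemaShape = { name: 'String', age: 'Number', active: 'Boolean', createdAt: 'Date' };

// Produce a deterministic placeholder value for a given type name.
function sampleValue(type, i) {
  switch (type) {
    case 'String': return `value-${i}`;
    case 'Number': return i;
    case 'Boolean': return i % 2 === 0;
    case 'Date': return new Date(Date.UTC(2025, 0, 1 + i)).toISOString();
    default: return null;
  }
}

// Build `count` synthetic documents and serialize each one to bytes.
function generateSamples(shape, count) {
  const samples = [];
  for (let i = 0; i < count; i++) {
    const doc = {};
    for (const [field, type] of Object.entries(shape)) {
      doc[field] = sampleValue(type, i);
    }
    // JSON stands in for BSON here to keep the sketch dependency-free.
    samples.push(Buffer.from(JSON.stringify(doc)));
  }
  return samples;
}

const samples = generateSamples(schemaShape, 100);

// Trainer-style input: all samples back to back, plus the size of each sample.
const samplesBuffer = Buffer.concat(samples);
const sampleSizes = samples.map((s) => s.length);

console.log('samples:', samples.length, 'total bytes:', samplesBuffer.length);
```

Synthetic values like these mostly teach a trainer the field names and document layout. Whether that is representative enough of real data to help compression is an open question this sketch does not answer.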

baileympearson avatar Apr 11 '25 20:04 baileympearson

I'm going to close this issue for now since there isn't a way for Mongoose to reasonably implement zstd dictionary support without significant changes from the MongoDB Node driver.

vkarpov15 avatar May 12 '25 19:05 vkarpov15