DynamoDB grain state storage using compression

Open • smiron opened this issue 2 years ago • 3 comments

Problem

Reading and writing large grain states can lead to excessive resource utilization when using DynamoDB as the grain state store.

Solution

https://docs.aws.amazon.com/amazondynamodb/latest/developerguide/bp-use-s3-too.html

One of the best practices when using DynamoDB is to compress large attribute values. In this case, that means compressing the BinaryState and StringState attributes.

This pull request implements the following:

  • Store the grain state in a byte array in DynamoDB
  • Store the grain state properties in a DynamoDB Map. Currently supported properties: Serialization and Compression
  • Ability to store the grain state in DynamoDB in a compressed format
  • Ability to control the compression parameters via the config policy
  • Ability to control the serialization format via config values
  • Ability to seamlessly move between serialization formats, compression formats, and compression parameters
  • Seamless migration from the current storage format to the new one
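
The storage layout described in the list above can be sketched roughly as follows. This is a language-neutral illustration in Python, not the PR's C# code. The attribute names `State` and `Properties`, the property values `"json"`, `"gzip"`, and `"none"`, and the helper names are hypothetical; only `StringState`, `BinaryState`, and the property keys `Serialization` and `Compression` come from the description. The legacy branch assumes `StringState` holds JSON text.

```python
import gzip
import json

# Hypothetical threshold; mirrors the 2048-byte figure quoted in this PR.
COMPRESSION_THRESHOLD = 2048


def encode_state(state, threshold=COMPRESSION_THRESHOLD, level=1):
    """Serialize state to bytes and compress only above the threshold."""
    raw = json.dumps(state).encode("utf-8")
    if len(raw) > threshold:
        # compresslevel=1 is the fastest gzip setting.
        payload, compression = gzip.compress(raw, compresslevel=level), "gzip"
    else:
        payload, compression = raw, "none"
    return {
        "State": payload,  # hypothetical attribute name for the byte array
        "Properties": {"Serialization": "json", "Compression": compression},
    }


def decode_state(item):
    """Read an item in the new layout, falling back to the legacy layout."""
    if "State" not in item:
        # Legacy item: assume StringState holds JSON text.
        if "StringState" in item:
            return json.loads(item["StringState"])
        raise NotImplementedError("legacy BinaryState is not handled in this sketch")

    props = item.get("Properties", {})
    payload = item["State"]
    if props.get("Compression") == "gzip":
        payload = gzip.decompress(payload)
    if props.get("Serialization") != "json":
        raise ValueError("unsupported serialization in this sketch")
    return json.loads(payload.decode("utf-8"))
```

Because each item records its own serialization and compression, a reader can decode items written under older settings while new writes use the current ones, which is what makes switching formats non-disruptive.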

Using GZIP compression in FASTEST mode for items larger than 2048 bytes yields the following results:

  • Latency

    Mean: 39.87 us, Error: 0.784 us, StdDev: 0.871 us, Median: 39.73 us

  • Compression ratio: 494 %
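
A comparable measurement can be sketched in Python as below. The PR does not say how the ratio was computed, so this assumes it is uncompressed size divided by compressed size, expressed as a percentage. The sample payload is made up, and the numbers it produces will differ from the ones above.

```python
import gzip
import json
import timeit


def compression_ratio_percent(raw, level=1):
    """Uncompressed size over compressed size, as a percentage (assumed definition)."""
    compressed = gzip.compress(raw, compresslevel=level)
    return 100.0 * len(raw) / len(compressed)


# Made-up payload standing in for a large grain state.
sample = json.dumps(
    {"items": [{"id": i, "name": f"item-{i}"} for i in range(500)]}
).encode("utf-8")

ratio = compression_ratio_percent(sample)
runs = 100
seconds = timeit.timeit(lambda: gzip.compress(sample, compresslevel=1), number=runs) / runs
print(f"ratio: {ratio:.0f} %, mean compress time: {seconds * 1e6:.1f} us")
```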

smiron, Dec 15 '22 11:12

I think adding compression to the serializer could be a good idea. It could be added to the existing IGrainStorageSerializer interface, which would benefit all providers that already use this interface.

We are aware that the current implementation doesn't make it easy to migrate between serializers (it is still doable via IGrainStorageSerializer composition). We can try to make some changes for Orleans 8, but not before, since they would break backward compatibility.
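
To illustrate the composition idea, here is a minimal Python sketch rather than Orleans code. The class and method names are hypothetical and do not reflect the actual IGrainStorageSerializer signatures. A compressing serializer wraps an inner one and, on read, uses the gzip magic bytes to recognize data written before compression was enabled.

```python
import gzip
import json

GZIP_MAGIC = b"\x1f\x8b"  # first two bytes of every gzip stream


class JsonSerializer:
    """Hypothetical inner serializer producing UTF-8 JSON bytes."""

    def serialize(self, value):
        return json.dumps(value).encode("utf-8")

    def deserialize(self, data):
        return json.loads(data.decode("utf-8"))


class GzipSerializer:
    """Hypothetical wrapper that compresses the inner serializer's output."""

    def __init__(self, inner, level=1):
        self.inner = inner
        self.level = level

    def serialize(self, value):
        return gzip.compress(self.inner.serialize(value), compresslevel=self.level)

    def deserialize(self, data):
        # Data written before compression was enabled lacks the gzip header,
        # so it is passed to the inner serializer unchanged.
        if data[:2] == GZIP_MAGIC:
            data = gzip.decompress(data)
        return self.inner.deserialize(data)
```

The header check is safe here only because JSON text never begins with those two bytes; an inner serializer that emits arbitrary binary would need an explicit format marker instead.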

benjaminpetit, Dec 19 '22 13:12

I'm not convinced that this feature should make it into all storage providers. Most (if not all) SQL engines can parse JSON strings, and enabling compression would prevent that. Also, some SQL engines support compression at the engine level, so adding compression in Orleans would hurt performance without providing any benefit.

DynamoDB, on the other hand, has no JSON parsing capability and doesn't implement compression either. For this reason, DynamoDB is a good example of where compression should be implemented.

Given your comment about implementing compression in Orleans 8, would it make sense to retarget this PR at the main branch instead?

smiron, Dec 20 '22 11:12

I didn't notice that your PR is targeting 3.x. Yes, if we end up implementing something like that, it will be in 8.

benjaminpetit, Dec 28 '22 17:12