
PartitionGrains and Rebalancing

Open wassim-k opened this issue 11 months ago • 2 comments

Hi, we have a common scenario in our Orleans setup which involves the use of {Entity}PartitionGrains. Those grains are responsible for 100,000s of entities, and they forward events over implicit streams to the individual {Entity}Grain. Generally the number of partitions is low, e.g. 8, 16, or 32.
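To make the routing concrete, here is a minimal sketch of how an entity could be mapped to its partition. It is written in Python only for brevity and is not Orleans code. The stable-hash scheme, `PARTITION_COUNT`, `partition_for`, and `group_by_partition` are all assumptions for illustration; the actual mapping may differ.

```python
import hashlib

PARTITION_COUNT = 16  # assumed value; in practice 8, 16, or 32


def partition_for(entity_id: str, partition_count: int = PARTITION_COUNT) -> int:
    """Map an entity id to a stable partition index (hypothetical scheme)."""
    # A cryptographic digest is used because Python's built-in hash() is
    # salted per process and would not be stable across restarts.
    digest = hashlib.sha256(entity_id.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % partition_count


def group_by_partition(events):
    """Group (entity_id, payload) pairs by the partition that owns the entity."""
    grouped = {}
    for entity_id, payload in events:
        grouped.setdefault(partition_for(entity_id), []).append((entity_id, payload))
    return grouped
```

With a fixed partition count, an entity always lands in the same partition, which is what lets a partition grain keep local state (a query instance, a local store) for its entities.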

Those EntityPartitionGrains can take many forms, e.g.:

  • Each partition has a Trill query instance running within it and is responsible for doing some streaming analytics on its partition of entities before forwarding to EntityGrains.
  • Each partition has a RocksDB instance which efficiently updates and enriches the entity's data before forwarding to EntityGrains.
  • In a multi-tenant application, each TenantGrain subscribes to its own MQTT topic wildcard, e.g. tenants/{TenantId|GrainId}/*, and forwards payloads to EntityGrains.

This is an example of what the setup looks like:

(diagram: mermaid-diagram-2024-03-22-130338)

This system works and performs very well until we consider scaling out.

Ideally, after a new silo joins the cluster, the placement of grains should change to:

(diagram: mermaid-diagram-2024-03-22-130727)

There are 2 things at play here:

  1. The partition grains themselves are balanced evenly across active silos, since they are generally "heavy" grains.
  2. All active entity grains that belong to a certain partition are migrated across with their partition grain.
    • This can also mean those grains are simply deactivated on the migration of their partition.
    • Or they are lazily migrated the next time the partition grain forwards a request to them.
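The desired behaviour can be sketched as a small simulation. This is Python, not Orleans code, and it assumes a plain partition-to-silo map that some coordinator can rewrite; `rebalance` and `preferred_silo` are hypothetical names. The sketch spreads partitions evenly over the active silos while moving as few as possible, and entity grains then simply follow whatever silo their partition is on.

```python
def rebalance(assignment: dict[int, str], silos: list[str]) -> dict[int, str]:
    """Return an even partition -> silo map, moving as few partitions as possible."""
    if not silos:
        raise ValueError("at least one active silo is required")

    owned = {silo: [] for silo in silos}
    pool = []  # partitions that must move (their silo is gone or over quota)
    for partition in sorted(assignment):
        silo = assignment[partition]
        (owned[silo] if silo in owned else pool).append(partition)

    # Every silo gets `base` partitions; the `extra` most-loaded silos get one more,
    # so silos that already hold many partitions give up as few as possible.
    base, extra = divmod(len(assignment), len(silos))
    ranked = sorted(silos, key=lambda s: -len(owned[s]))
    quota = {silo: base + (1 if i < extra else 0) for i, silo in enumerate(ranked)}

    for silo in silos:
        while len(owned[silo]) > quota[silo]:
            pool.append(owned[silo].pop())
    for silo in silos:
        while len(owned[silo]) < quota[silo]:
            owned[silo].append(pool.pop())

    return {p: silo for silo, partitions in owned.items() for p in partitions}


def preferred_silo(entity_partition: int, assignment: dict[int, str]) -> str:
    """An entity grain's preferred silo is wherever its partition currently lives."""
    return assignment[entity_partition]
```

For example, with 8 partitions split 4/4 over two silos, adding a third silo yields a 3/3/2 split and moves only two partitions. Entity grains of a moved partition could then be deactivated eagerly, or re-placed via `preferred_silo` the next time the partition forwards to them.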

Considerations

  • EntityGrains receive events and calls from other sources. For example, they may react to domain events generated by other microservices that update their state, which is why moving those grains into the partition itself is not an option.

Questions

  1. Does the use of PartitionGrains align with Orleans' design, or is it an uncommon use case?
  2. Does Orleans support a TypeActivationCount placement strategy? Or is it currently feasible to implement as a custom placement strategy? Specifically, is there a way to track the number of activations per silo for a specific grain type?
  3. Similarly, in your future plans for rebalancing of grains (if you have given this some thought), do you envision it supporting balancing based on grain type activation count?
  4. Any ideas on how the (partition => grain) association may work when migrating or rebalancing the partition grain?
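To clarify what is meant by a TypeActivationCount strategy in question 2, here is a minimal sketch in Python. It is not an Orleans API: `TypeActivationCountPlacement` and its methods are hypothetical, and it assumes per-silo, per-grain-type activation counts are available in one place, which is exactly the part the question asks about.

```python
from collections import Counter


class TypeActivationCountPlacement:
    """Place each new activation on the silo with the fewest activations of that type."""

    def __init__(self, silos):
        # silo -> Counter of grain type -> number of active activations
        self._counts = {silo: Counter() for silo in silos}

    def add_silo(self, silo):
        self._counts.setdefault(silo, Counter())

    def place(self, grain_type):
        # Only activations of the same grain type are compared; ties are
        # broken by silo name so the result is deterministic.
        silo = min(self._counts, key=lambda s: (self._counts[s][grain_type], s))
        self._counts[silo][grain_type] += 1
        return silo

    def deactivated(self, silo, grain_type):
        if self._counts[silo][grain_type] > 0:
            self._counts[silo][grain_type] -= 1

    def count(self, silo, grain_type):
        return self._counts[silo][grain_type]
```

The point of counting per type is that a few heavy partition grains get spread evenly even when one silo hosts far more lightweight entity grains than another, which a total-activation-count strategy would not guarantee.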

wassim-k • Mar 22 '24 03:03