orleans
PartitionGrains and Rebalancing
Hi, we have a common scenario in our Orleans setup which involves the use of {Entity}PartitionGrains. Those grains are each responsible for 100,000s of entities, and they forward events over implicit streams to the individual {Entity}Grain.
Generally the number of partitions is low, e.g. 8, 16, 32
Those EntityPartitionGrains can take many forms, e.g.:
- Each partition has a `Trill` query instance running within it and is responsible for doing some streaming analytics on its partition of entities before forwarding to `EntityGrains`.
- Each partition has a `RocksDB` instance which efficiently updates and enriches the entity's data before forwarding to `EntityGrains`.
- In a multi-tenancy application each `TenantGrain` subscribes to its own `MQTT` topic wildcard, e.g. `tenants/{TenantId|GrainId}/*`, and forwards payloads to `EntityGrains`.
This is an example of what the setup looks like:
This system works very well and performantly until we consider scaling.
Ideally, after a new silo joins the cluster, the placement of grains should change to:
There are 2 things at play here:
- The partition grains themselves are balanced evenly across active silos, since they are generally "heavy" grains.
- All active entity grains that belong to a certain partition are migrated across with their partition grain.
  - This can also mean those grains are simply deactivated on the migration of their partition.
  - Or they are lazily migrated the next time the partition grain forwards a request to them.
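The lazy option above can be illustrated with a small toy model. This is only a sketch in plain Python, not Orleans code: the `Cluster` class, its dictionaries, and the `migrate_partition` and `forward` methods are all hypothetical names introduced for illustration. It assumes the partition grain moves first and each entity grain follows only when the partition next forwards an event to it.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Cluster:
    """Toy model of grain locations (hypothetical; not an Orleans API)."""

    partition_silo: dict[int, str] = field(default_factory=dict)    # partition id -> silo
    entity_silo: dict[str, str] = field(default_factory=dict)       # entity id -> silo
    entity_partition: dict[str, int] = field(default_factory=dict)  # entity id -> partition id

    def migrate_partition(self, partition: int, target_silo: str) -> None:
        # Only the heavy partition grain moves; entity grains stay where they are.
        self.partition_silo[partition] = target_silo

    def forward(self, entity: str) -> str:
        """Forward an event to an entity and return the silo that handles it."""
        home = self.partition_silo[self.entity_partition[entity]]
        if self.entity_silo.get(entity) != home:
            # Lazy step: the entity follows its partition on the next forwarded event.
            self.entity_silo[entity] = home
        return home
```

In this model an entity that never receives another forwarded event never moves, which is the trade-off of the lazy variant compared with deactivating all of a partition's entities up front.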
Considerations
EntityGrains also receive events and calls from other sources. For example, they may react to domain events generated by other microservices that update their state, so moving those grains into the partition itself is not an option.
Questions
- Does the use of `PartitionGrains` align with Orleans design, or is it an uncommon use case?
- Does Orleans support a `TypeActivationCount` placement strategy? Or is it currently feasible to implement as a custom placement strategy? Specifically, is there a way to track the number of activations per silo for a specific grain type?
- Similarly, in your future plans for rebalancing of grains (if you have given this some thought), do you envision it supporting balancing based on grain type activation count?
- Any ideas on how the (partition => grain) association may work when migrating/rebalancing of the partition grain?
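To make the `TypeActivationCount` idea concrete, here is a minimal sketch of the selection rule in plain Python. It is a simulation only and does not use any Orleans API: `pick_silo`, the `(silo, grain_type)` tuples, and the silo names are hypothetical. It assumes the per-silo, per-type activation counts are available from somewhere, which is exactly what the question above asks about.

```python
from collections import Counter


def pick_silo(silos, activations, grain_type):
    """Pick the silo hosting the fewest activations of `grain_type`.

    silos: list of silo names (hypothetical identifiers).
    activations: iterable of (silo, grain_type) pairs for all active grains.
    """
    counts = Counter(silo for silo, gtype in activations if gtype == grain_type)
    # Silos with no activations of this type count as 0; ties break by name
    # so the choice is deterministic.
    return min(silos, key=lambda s: (counts[s], s))


activations = [
    ("A", "Partition"), ("A", "Partition"),
    ("B", "Partition"), ("B", "Entity"), ("B", "Entity"), ("B", "Entity"),
]
chosen = pick_silo(["A", "B"], activations, "Partition")
```

Here silo B hosts more activations overall (4 versus 2) but fewer partition grains (1 versus 2), so a per-type rule picks B where a total-count rule would pick A.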
- The partition grain pattern aligns well with Orleans. How have you implemented it?
- ActivationCountBased placement works based on all activations rather than by type. I think this is generally the right strategy, but in your case you have some domain specific knowledge about the partition grains being these hubs for communication.
- No, not for the Active Rebalancing work, but this kind of soft anti-affinity makes sense to have as a separate policy. The placement policy is consulted when migrating a grain, and it can reevaluate where best to place the grain at that time. For the pattern you're describing, I think periodic rebalancing, e.g. on a timer, would be sufficient. To be clear: I think this would be a good feature to include in Orleans and it's complementary to the Active Rebalancing work.
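As a rough illustration of what a periodic rebalancing pass for the partition grains could compute, here is a sketch in plain Python. It is not Orleans code: `plan_moves` and its arguments are hypothetical, it assumes every silo named in `placement` is still in `silos`, and in a real system something like it would be invoked from a timer and followed by actual grain migrations.

```python
def plan_moves(placement, silos):
    """Return (partition, from_silo, to_silo) moves that even out partition counts.

    placement: dict of partition id -> current silo (hypothetical model).
    silos: list of currently active silo names.
    """
    by_silo = {s: [] for s in silos}
    for partition, silo in sorted(placement.items()):
        by_silo[silo].append(partition)

    moves = []
    while True:
        most = max(silos, key=lambda s: (len(by_silo[s]), s))
        least = min(silos, key=lambda s: (len(by_silo[s]), s))
        # Stop once no silo holds more than one partition above any other.
        if len(by_silo[most]) - len(by_silo[least]) <= 1:
            return moves
        partition = by_silo[most].pop()
        by_silo[least].append(partition)
        moves.append((partition, most, least))


# 8 partitions on silos A and B; silo C has just joined the cluster.
placement = {p: ("A" if p < 4 else "B") for p in range(8)}
moves = plan_moves(placement, ["A", "B", "C"])
```

For this example the plan contains two moves, leaving the silos with 3, 3, and 2 partitions. Moving one partition at a time from the fullest to the emptiest silo keeps the number of migrations small, which matters because the partition grains are heavy.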
Thanks for getting back to me @ReubenBond
- So we have a few kinds of partition grains, as detailed in the examples above, but generally we have a predefined partition count and a startup task that activates all partitions.
- Do you see a combination of `[RebalancePeriodically]` and `[TypeActivationCountBasedPlacement]` attributes as a possible solution in the future, which over time ensures those partition grains are balanced evenly across active silos?