Some extra features?
What are your thoughts on join/leave subscriptions and allowing metadata with the registration?
Yes, I am working on these two, but do not have any specific timeframe yet.
Waiting for ForgETS as well :P, would be nice to try it, even if it's not ready (if it's planned to be open source).
Better late than never... pg monitoring (join/leave subscriptions) was implemented around October 2021 and is now available in OTP 25.1 and above. Of course, spg has it as well.
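For readers who have not tried the monitoring API yet, here is a minimal sketch of a join/leave subscription from Elixir. It assumes OTP 25.1 or later; the module name, scope name and group name are made up for illustration, and the one-second timeout is arbitrary.

```elixir
defmodule PgMonitorDemo do
  # Starts a scope, subscribes to it, then joins and leaves a group
  # to show the notifications delivered to the subscriber.
  def run(scope, group) do
    {:ok, _pid} = :pg.start(scope)

    # Returns a reference plus a snapshot map of group => members.
    {ref, snapshot} = :pg.monitor_scope(scope)

    :ok = :pg.join(scope, group, self())
    joined = await(ref, :join, group)

    :ok = :pg.leave(scope, group, self())
    left = await(ref, :leave, group)

    # Stop receiving notifications for this reference.
    :pg.demonitor(scope, ref)

    %{snapshot: snapshot, joined: joined, left: left}
  end

  # Notifications arrive as {ref, :join | :leave, group, pids}.
  defp await(ref, event, group) do
    receive do
      {^ref, ^event, ^group, pids} -> pids
    after
      1_000 -> :timeout
    end
  end
end
```

Calling `PgMonitorDemo.run(:demo_scope, :workers)` from a shell should return the empty snapshot plus the pid lists carried by the join and leave messages. `:pg.monitor/2` works the same way for a single group.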
Metadata is a trickier question, I still don't have a performant enough implementation. But there is a solution for that, with full scope monitoring. Effectively we run a few more processes monitoring groups, and those processes attach necessary metadata.
I used a hack for metadata where the group name carried the metadata, and the query became a direct ETS lookup. I think it was mentioned on the OTP issue tracker as well.
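# worker stands for the record of a physical node; jobWeight is the known cost of this job
worker = %{uuid: :physical_node_X}
:pg.join({PGInferenceWeight, worker.uuid, jobWeight}, self())

def inference_job_weight(worker_uuid) do
  # Reads the ETS table of the default pg scope directly.
  # The pattern expects rows shaped {group, pids, _}.
  :ets.select(:pg, [{
    {{PGInferenceWeight, worker_uuid, :"$1"}, :"$2", :_},
    [],
    [{{:"$1", :"$2"}}]
  }])
  # Each group contributes weight * number of member pids
  |> Enum.reduce(0, fn {weight, pids}, acc ->
    acc + weight * length(pids)
  end)
end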
An example: we have a physical worker node that does AI inference, and it can run more than one inference in parallel. Because different types of inference take different amounts of resources, and we know the cost upfront, we assign a weight to each. Each inference request joins a PG group; we then query the groups and sum the weights, and if cum_weight >= 1 we do not queue further inferences on that node. If anything goes wrong, like a client randomly dropping, the inference request will auto-leave the group and neatly adjust the cum_weight the next time it's queried. This leads to much cleaner-looking (and less buggy/racy) code. In the super rare case that there is a race (there are no locks or sync primitives), we don't really care: the node will simply get a bit overloaded for the next few seconds. So it becomes best-effort load balancing.
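The admission check described above can also be sketched with the public pg API instead of reading the ETS table directly. This is only an illustration: the module name, the function names and the explicit scope argument are assumptions, and the two reads are not atomic, which matches the best-effort behaviour described.

```elixir
defmodule InferenceAdmission do
  # Sums weight * member count over every group named
  # {PGInferenceWeight, worker_uuid, weight} in the given scope.
  def cum_weight(scope, worker_uuid) do
    for {PGInferenceWeight, ^worker_uuid, weight} = group <- :pg.which_groups(scope),
        reduce: 0 do
      acc -> acc + weight * length(:pg.get_members(scope, group))
    end
  end

  # Joins the weighted group unless the node is already at or over capacity.
  # There is no lock between the check and the join, so a small overshoot is possible.
  def try_admit(scope, worker_uuid, weight, pid \\ self()) do
    if cum_weight(scope, worker_uuid) >= 1 do
      {:error, :overloaded}
    else
      :pg.join(scope, {PGInferenceWeight, worker_uuid, weight}, pid)
    end
  end
end
```

Because pg removes a process from its groups when it exits, a crashed or disconnected request stops counting toward `cum_weight/2` without any cleanup code.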
About ForgETS, we coded this up: https://github.com/xenomorphtech/mnesia_kv. It's missing distributed functionality (so far the company has not been fortunate enough to reach the scale needed), which ideally would be ~~C~~ A P, with the C handled by a group leader (if you want basic C you need to execute the TX on the leader node).
We are using the same technique (encoding sharding/partitioning in the Group Name), but one thing that is missing from spg/pg is a choice of ETS table type. Right now it's hardcoded as a set, while it would be more performant to use ordered_set for faster selection in a consistent hash ring.
It would be nice to make it ordered_set, or to let the user configure it when they init pg.
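To illustrate why ordered_set helps here, below is a small sketch of a hash-ring lookup on a standalone ETS table. It is not pg's own table; the module name, the integer ring points and the "first point strictly after the hash" convention are all assumptions made for the example.

```elixir
defmodule HashRingSketch do
  # Builds an ordered_set table from a list of {point, node} pairs.
  def new(points) do
    table = :ets.new(:hash_ring_sketch, [:ordered_set, :protected])
    true = :ets.insert(table, points)
    table
  end

  # Returns the node owning the first point strictly after `hash`,
  # wrapping around to the smallest point at the end of the ring.
  def owner(table, hash) do
    point =
      case :ets.next(table, hash) do
        :"$end_of_table" -> :ets.first(table)
        next -> next
      end

    case :ets.lookup(table, point) do
      [{^point, node}] -> {:ok, node}
      # Empty ring: first/1 returned :"$end_of_table" and nothing matched.
      [] -> :error
    end
  end
end
```

On an ordered_set, `:ets.next/2` accepts a key that is not in the table and returns its successor in term order, so the lookup is a single tree search. With a set table the same query would need a scan over all group names.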