
Some extra features?

Open vans163 opened this issue 5 years ago • 6 comments

What are your thoughts on join/leave subscriptions and allowing metadata with the registration?

vans163 avatar Jan 22 '20 18:01 vans163

Yes, I am working on these two, but do not have any specific timeframe yet.

max-au avatar Jan 27 '20 20:01 max-au

Waiting for ForgETS as well :P, would be nice to try it, even if it's not ready (if it's planned to be open source).

vans163 avatar Jan 27 '20 21:01 vans163

Better late than never... pg monitoring (join/leave subscriptions) was implemented around October 2021 and is now available in OTP 25.1 and above. Of course, spg has it as well.

Metadata is a trickier question; I still don't have a performant enough implementation. But there is a workaround using full scope monitoring: effectively, we run a few extra processes that monitor groups, and those processes attach the necessary metadata.
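
For reference, a minimal Elixir sketch of what those join/leave subscriptions look like from the caller's side, assuming OTP 25.1+ and the documented pg:monitor/1 API; the group name :my_group and the printing are illustrative:

    # Subscribe to membership changes of one group in the default :pg scope.
    # Returns a monitor reference plus the members at subscription time.
    {ref, current_members} = :pg.monitor(:my_group)
    IO.inspect(current_members, label: "current members")

    # The subscribing process then receives a message for each join and leave
    # (a real consumer would loop; a single receive is enough for the sketch).
    receive do
      {^ref, :join, :my_group, pids} -> IO.inspect(pids, label: "joined")
      {^ref, :leave, :my_group, pids} -> IO.inspect(pids, label: "left")
    end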

max-au avatar Feb 09 '23 16:02 max-au

I used a hack for metadata where the group name carried the metadata, and the query became a direct ETS lookup on the scope table. I think it was mentioned on the OTP issue tracker as well.

    # Each inference job joins a group whose name encodes the worker's uuid and
    # the job's weight, so group membership itself carries the metadata.
    worker_uuid = "physical_node_X"
    job_weight = 0.25  # known upfront for this inference type
    :pg.join({PGInferenceWeight, worker_uuid, job_weight}, self())

    # Inside a module: sum weight * member count for one worker by selecting
    # directly from the default :pg scope table ({group, all_pids, local_pids}).
    def inference_job_weight(worker_uuid) do
      :ets.select(:pg, [{
        {{PGInferenceWeight, worker_uuid, :"$1"}, :"$2", :_},
        [],
        [{{:"$1", :"$2"}}]
      }])
      |> Enum.reduce(0, fn {weight, pids}, acc ->
        acc + weight * length(pids)
      end)
    end

The example: we have a physical worker node that does AI inference, and it can run more than one inference in parallel. Different types of inference take different amounts of resources, and we know the cost upfront, so we assign a weight to each. We join the pg group, and at query time we sum the weights; if the cumulative weight is >= 1 we don't queue further inferences on that node (see the sketch below). If anything goes wrong, such as a client randomly dropping, the inference request automatically leaves the group and the cumulative weight is correct the next time it is queried. This leads to much cleaner-looking (and less buggy/racy) code. In the super rare case where there is a race (there are no locks or sync primitives), we don't really care: the node simply gets a bit overloaded for the next few seconds. So it becomes best-effort load balancing.
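
A minimal sketch of that admission check, reusing inference_job_weight/1 from the snippet above; start_inference/2 is a hypothetical placeholder for however the job process is actually started:

    # Best-effort admission control: races are tolerated, and because pg drops
    # dead member pids automatically, the cumulative weight self-corrects.
    def maybe_dispatch(worker_uuid, job, job_weight) do
      if inference_job_weight(worker_uuid) >= 1 do
        {:error, :overloaded}
      else
        {:ok, pid} = start_inference(worker_uuid, job)  # hypothetical helper
        :pg.join({PGInferenceWeight, worker_uuid, job_weight}, pid)
        {:ok, pid}
      end
    end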

About ForgETS: we coded this up instead, https://github.com/xenomorphtech/mnesia_kv. It's missing the distributed functionality (so far the company has not been fortunate enough to reach the scale that needs it), which ideally would be ~~C~~ A P, with the C handled by a group leader (if you want basic C, you need to execute the TX on the leader node).

vans163 avatar Feb 12 '23 03:02 vans163

We are using the same technique (encoding sharding/partitioning in the Group Name), but one thing that is missing from spg/pg is control over the ETS table type. Right now it's hardcoded as a set, while an ordered_set would be more performant for faster selection in a consistent hash ring.
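
For context, a standalone sketch (plain ETS, not touching pg's internal table) of why ordered_set helps here: ring positions become the keys, so finding the owner of a hash is a single :ets.next/2 call instead of a select over the whole table. Member names and the vnode count are illustrative:

    # Build a tiny consistent hash ring in an ordered_set table.
    ring = :ets.new(:ring, [:ordered_set, :public])

    for member <- [:node_a, :node_b, :node_c], vnode <- 1..64 do
      :ets.insert(ring, {:erlang.phash2({member, vnode}), member})
    end

    # Owner of a key = first ring position after its hash, wrapping around.
    owner_of = fn key ->
      pos =
        case :ets.next(ring, :erlang.phash2(key)) do
          :"$end_of_table" -> :ets.first(ring)
          next -> next
        end

      [{^pos, owner}] = :ets.lookup(ring, pos)
      owner
    end

    owner_of.("some_request_id")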

max-au avatar Feb 13 '23 22:02 max-au

> We are using the same technique (encoding sharding/partitioning in the Group Name), but one thing that is missing from spg/pg is control over the ETS table type. Right now it's hardcoded as a set, while an ordered_set would be more performant for faster selection in a consistent hash ring.

It would be nice to make it an ordered_set, or to let the user configure the table type when they init pg.

vans163 avatar Feb 17 '23 18:02 vans163