
Support of alerts in booth site and boothd arbitrator

Open rohsaini opened this issue 4 years ago • 3 comments

I am requesting a feature for booth nodes similar to pacemaker alerts. I understand that some of the items below may not be possible to implement due to technical constraints, but it would be interesting to see how most of the cases can be handled. Please check the attached image.

  1. The node5 booth-arbitrator should be able to emit an event when any booth site node joins or leaves.
  2. A geo site booth should be able to emit an event when one of its booth peers joins or leaves. For example, geo site1 emits an event when the node5 booth-arbitrator joins/leaves or when the site2 booth joins/leaves. The booth-ip can be passed in the event.
  3. On ticket movements (revoke/grant), every booth node (site1, site2 and node5) should emit events.

At a high level, these are node/resource-style events with respect to booth.

As of today, booth has no provision for a node to emit an event when one of its peers leaves or joins. This makes it difficult to know whether the geo site nodes can see the booth-arbitrator. The same is true the other way around: there is no way to tell whether the booth-arbitrator can see the geo booth sites. I am not sure how others handle this in today's deployments, but I see a need for each booth node to monitor every other booth node, so that appropriate alarms can be raised on the basis of these events and action can be taken accordingly.

[attached image]

rohsaini, Aug 14 '20 04:08

Relevant ML thread: https://lists.clusterlabs.org/pipermail/users/2020-August/027533.html

I think the best we can do for Booth is to call a script when a ticket is granted/rejected.

Also, as with qdevice, we must make sure to call the script asynchronously, so that it doesn't block processing of other events.
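As a minimal sketch of that idea (not booth's actual hook code), the following Python starts a handler script without waiting for it to finish. The function names and the `BOOTH_ALERT_*` environment variable names are hypothetical, chosen only to mirror the way pacemaker alerts pass data to alert agents through `CRM_alert_*` variables.

```python
import os
import subprocess


def fire_alert(script_argv, kind, ticket, site):
    """Start the handler without waiting for it and return the Popen object."""
    env = dict(os.environ)
    # Hypothetical variable names, not defined by booth itself.
    env.update({
        "BOOTH_ALERT_KIND": kind,      # e.g. "ticket-granted"
        "BOOTH_ALERT_TICKET": ticket,  # ticket name
        "BOOTH_ALERT_SITE": site,      # booth-ip of the site concerned
    })
    # Popen returns as soon as the child is started, so the caller's
    # event loop is not blocked while the handler runs.
    return subprocess.Popen(
        script_argv,
        env=env,
        stdin=subprocess.DEVNULL,
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
        start_new_session=True,  # detach from the caller's session (POSIX)
    )


def reap_finished(children):
    """Poll every child so finished ones are reaped; return those still running."""
    return [child for child in children if child.poll() is None]
```

The caller would keep the returned objects in a list and call `reap_finished` periodically from its main loop, so that finished handlers do not linger as zombie processes.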

@dmuhamedagic may have other ideas on how to approach this issue.

jfriesse, Aug 17 '20 07:08

On Mon, Aug 17, 2020 at 12:50:16AM -0700, Jan Friesse wrote:

> Relevant ML thread: https://lists.clusterlabs.org/pipermail/users/2020-August/027533.html

> I think the best we can do for Booth is to call a script when a ticket is granted/rejected.

That is of course possible. In fact, most of the infrastructure for hooks is already there.

BTW, booth is a raft implementation and does have a notion of membership, so it would be possible to define more events. The FSM is documented in dot files in the documentation directory.
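One way join/leave events could be derived from a membership view, sketched in Python under the assumption that the daemon can produce the set of currently visible peers. The function name and the event kinds are made up for illustration and do not come from booth.

```python
def membership_events(previous, current):
    """Return (kind, peer) tuples for peers that joined or left.

    `previous` and `current` are iterables of peer identifiers,
    for example booth-ip strings.
    """
    previous, current = set(previous), set(current)
    # Peers present now but not before have joined.
    events = [("peer-joined", peer) for peer in sorted(current - previous)]
    # Peers present before but not now have left.
    events += [("peer-left", peer) for peer in sorted(previous - current)]
    return events
```

For example, `membership_events({"site1", "site2", "arbitrator"}, {"site1", "arbitrator"})` returns `[("peer-left", "site2")]`.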

> Also, as with qdevice, we must make sure to call the script asynchronously, so that it doesn't block processing of other events.

The existing code invokes external programs asynchronously.

> @dmuhamedagic may have other ideas on how to approach this issue.

Perhaps by adding SNMP support, which is the standard for this kind of thing.

dmuhamedagic, Sep 11 '20 15:09

This enhancement would really be worth the trouble. Any volunteers? :)

dmuhamedagic, Sep 26 '23 11:09