controller should have leader election option when not running webhooks
Describe the solution you'd like: I'd like to deploy, e.g., the "audit" component statically (not using k8s deployments) in HA. It would be very convenient to leverage controller-runtime's leader election capability for this. Enabling it via a CLI flag would be ideal.
Anything else you would like to add: The program should probably crash if leader election AND webhooks are both enabled, since webhooks should not depend on apiserver's leader election in an HA setting.
Environment:
- Gatekeeper version: 3.7.0
- Kubernetes version: (use kubectl version): 1.21.7
@shomron I think we discussed this at some point. Do you know if that discussion is documented anywhere?
Definitely came up in a community meeting once.
I think this was discussed at https://github.com/open-policy-agent/gatekeeper/discussions/1451#discussioncomment-1062583 (as noted in the community call notes); however, it seems it was deleted.
Argh, migrated to the discussions project maybe?
We turned off discussions because of the opa/feedback migration; however, existing discussions couldn't be transferred. I took a screenshot of it now:

Thanks for digging up that thread. I've already added a couple leader-election flags to my GK fork and will be testing soon. At some point, I'd like to nix the fork though.
Depending on the complexity/behavioral cost, I see no reason not to add it in principle. Curious to see what the PR would look like. I think @shomron was the most skeptical of this.
suggestions for a PR:
- two flags (in main.go)
  - leader-election (bool, true to enable)
  - leader-namespace (ns for leader leases, default to "gatekeeper-system")
- ctrl.Options (in main.go)
  - LeaderElection: maps to leader-election flag
  - LeaderElectionID: gk-leader-12345 (or whatever)
  - LeaderElectionNamespace: maps to leader-namespace flag
  - LeaderElectionResourceLock: "leases"
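A rough sketch of how the flag-to-option mapping in the list above could look. To keep it compilable without controller-runtime, `leaderOptions` is a local stand-in that mirrors the four ctrl.Options fields named above; `parseLeaderOptions` is a hypothetical helper, and defaulting leader-election to false is an assumption made for backward compatibility.

```go
package main

import (
	"flag"
	"fmt"
)

// leaderOptions is a local stand-in that mirrors the four ctrl.Options
// fields named in the list above, so the sketch compiles on its own.
type leaderOptions struct {
	LeaderElection             bool
	LeaderElectionID           string
	LeaderElectionNamespace    string
	LeaderElectionResourceLock string
}

// parseLeaderOptions registers the two proposed flags on a fresh FlagSet
// and maps the parsed values onto the option fields.
func parseLeaderOptions(args []string) (leaderOptions, error) {
	fs := flag.NewFlagSet("gatekeeper", flag.ContinueOnError)
	enabled := fs.Bool("leader-election", false, "enable leader election")
	namespace := fs.String("leader-namespace", "gatekeeper-system", "namespace for leader leases")
	if err := fs.Parse(args); err != nil {
		return leaderOptions{}, err
	}
	return leaderOptions{
		LeaderElection:             *enabled,
		LeaderElectionID:           "gk-leader-12345", // placeholder ID from the list above
		LeaderElectionNamespace:    *namespace,
		LeaderElectionResourceLock: "leases",
	}, nil
}

func main() {
	opts, err := parseLeaderOptions([]string{"--leader-election", "--leader-namespace=gk-audit"})
	if err != nil {
		panic(err)
	}
	fmt.Printf("%+v\n", opts)
}
```

In a real patch the four values would be set directly on the ctrl.Options passed to the manager; with leader-election left at false, behavior stays as it is today.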
I'd love to submit a PR for this; sadly there's a TON of corporate red tape in the way. The above changes yield a fairly small patch that's easy to test and backward compat when leader-election is false.
This issue has been automatically marked as stale because it has not had recent activity. It will be closed in 14 days if no further activity occurs. Thank you for your contributions.
/reopen
On Tue, Aug 30, 2022, 7:29 PM stale[bot] wrote:
Closed #1794 https://github.com/open-policy-agent/gatekeeper/issues/1794 as completed.
max-bot reopening @sozercan what should users do if they want to re-open a bug?
Not stale
This issue has been automatically marked as stale because it has not had recent activity. It will be closed in 14 days if no further activity occurs. Thank you for your contributions.
not stale
@jdef Would you like to start a design doc and then bring the discussion to one of the community calls to drive this?
this seems like a pretty trivial change, is a design doc really needed? otherwise i'm happy to join a call
-- James DeFelice
Not sure if a full-scale design doc is needed, but it would be good to know which controllers are in scope for this. It sounds like singleton controllers only, such as status and audit; the other controllers should keep running if we want hot standbys.
Also it'd be good to avoid the "webhooks cannot run next to audit with leader election enabled" edge case. Is it possible to have leader election be selectively enforced?
Of course, this does mean that a pod will self-restart if it loses the leader position, which, if serving a webhook endpoint, might have some interesting impacts to endpoint behavior.
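On selective enforcement: as far as I know, controller-runtime lets a runnable opt out of leader election through a `NeedLeaderElection() bool` method, and runnables that don't implement it are treated as needing it. The sketch below only imitates that pattern with local types so it runs on its own; the `runnable`, `component`, and `partition` names are hypothetical and none of this is Gatekeeper's code.

```go
package main

import "fmt"

// runnable is a minimal stand-in for a manager-started component.
type runnable interface {
	Name() string
}

// leaderElectionRunnable mirrors the shape of the opt-out hook:
// components returning false run on every replica.
type leaderElectionRunnable interface {
	NeedLeaderElection() bool
}

type component struct {
	name       string
	needLeader bool
}

func (c component) Name() string             { return c.name }
func (c component) NeedLeaderElection() bool { return c.needLeader }

// partition splits runnables into those gated on leadership and those that
// start immediately. Runnables without the method default to needing it.
func partition(rs []runnable) (leaderOnly, always []string) {
	for _, r := range rs {
		if ler, ok := r.(leaderElectionRunnable); ok && !ler.NeedLeaderElection() {
			always = append(always, r.Name())
			continue
		}
		leaderOnly = append(leaderOnly, r.Name())
	}
	return leaderOnly, always
}

func main() {
	leaderOnly, always := partition([]runnable{
		component{name: "audit", needLeader: true},
		component{name: "status", needLeader: true},
		component{name: "webhook", needLeader: false},
	})
	fmt.Println("leader only:", leaderOnly)
	fmt.Println("every replica:", always)
}
```

This covers which components wait for leadership; it does not address the self-restart on losing the lease mentioned above.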
You mentioned this is for pods not running via deployments and in other bugs you mentioned pods running off-cluster. I'm curious what the story for GC-ing stale by-pod status objects is?
Lastly, is there a reason for the leader election namespace to be different from G8r's namespace (by default this is determined via the downward API and passed via an env variable)?
This issue has been automatically marked as stale because it has not had recent activity. It will be closed in 14 days if no further activity occurs. Thank you for your contributions.