notification-controller

Add possibility to configure wildcard namespace value in Alert eventSources

Open anders-olofsson opened this issue 3 years ago • 15 comments

It would be nice to pick up all error events within a cluster using a wildcard namespace value in the eventSources section of an Alert - similar to how it works for name: https://github.com/fluxcd/notification-controller/blob/fbf1ea0413e12fe58e6386972468a152c42b215c/internal/server/event_handlers.go#L77

Currently you need to either duplicate the Provider (including any secrets) and Alert resources in every relevant namespace, or create a "global" Alert that references the other namespaces one by one. It would be simpler and less error-prone to allow namespace: "*".
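
A minimal sketch of the requested behaviour, not the controller's actual code: the EventSource and InvolvedObject types and the matches function are hypothetical stand-ins, and defaulting an empty namespace to the Alert's own namespace is an assumption. It only shows the namespace check treating "*" the same way the name check already does.

```go
package main

import "fmt"

// EventSource is a hypothetical, simplified stand-in for one entry
// in an Alert's eventSources list.
type EventSource struct {
	Kind      string
	Name      string
	Namespace string
}

// InvolvedObject is a hypothetical, simplified stand-in for the
// object an incoming event refers to.
type InvolvedObject struct {
	Kind      string
	Name      string
	Namespace string
}

// matches reports whether obj is selected by src. An empty source
// namespace falls back to the Alert's namespace (an assumption), and
// "*" is accepted as a wildcard for both name and namespace.
func matches(src EventSource, alertNamespace string, obj InvolvedObject) bool {
	ns := src.Namespace
	if ns == "" {
		ns = alertNamespace
	}
	if src.Kind != obj.Kind {
		return false
	}
	if ns != "*" && ns != obj.Namespace {
		return false
	}
	return src.Name == "*" || src.Name == obj.Name
}

func main() {
	src := EventSource{Kind: "HelmRelease", Name: "*", Namespace: "*"}
	obj := InvolvedObject{Kind: "HelmRelease", Name: "podinfo", Namespace: "team-a"}
	fmt.Println(matches(src, "flux-system", obj))
}
```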

anders-olofsson avatar Nov 03 '20 10:11 anders-olofsson

I don't see any issue implementing this, as it would just be a catch-all for all events in all namespaces, or for events from a specific resource kind in all namespaces. As we already allow event sources from multiple namespaces, it would just act as a helper to avoid having to type out each namespace. One thing to keep in mind is that doing this would probably generate a lot of events.

@stefanprodan do you have any opinion about this?

phillebaba avatar Nov 03 '20 10:11 phillebaba

This would break multi-tenancy. Imagine a tenant creating a "global" alert in their namespace that receives events with sensitive information about all the other tenants. I find this unacceptable; imagine if AWS allowed anyone to route all events from CloudWatch regardless of the account.

stefanprodan avatar Nov 03 '20 10:11 stefanprodan

I have also been thinking about that. In theory it would be possible if we were able to limit the event sources to what the assumed service account has permission to access. We obviously cannot allow just anyone to export all events in the cluster.
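
As a rough sketch of that idea, assuming a permission check is available as a callback: the CanRead type, the allowedNamespaces function and the grants table are all hypothetical, and a real check would have to ask the Kubernetes API (for example through a SubjectAccessReview) rather than a map.

```go
package main

import "fmt"

// CanRead is a hypothetical permission check: may the given service
// account read objects of this kind in this namespace?
type CanRead func(serviceAccount, kind, namespace string) bool

// allowedNamespaces narrows a wildcard down to the namespaces the
// service account is actually permitted to read.
func allowedNamespaces(serviceAccount, kind string, namespaces []string, canRead CanRead) []string {
	var allowed []string
	for _, ns := range namespaces {
		if canRead(serviceAccount, kind, ns) {
			allowed = append(allowed, ns)
		}
	}
	return allowed
}

func main() {
	// Hypothetical grants: service account -> namespaces it may read.
	grants := map[string]map[string]bool{
		"tenant-a": {"tenant-a": true},
		"platform": {"tenant-a": true, "tenant-b": true},
	}
	canRead := func(sa, kind, ns string) bool { return grants[sa][ns] }

	all := []string{"tenant-a", "tenant-b"}
	fmt.Println(allowedNamespaces("tenant-a", "HelmRelease", all, canRead))
	fmt.Println(allowedNamespaces("platform", "HelmRelease", all, canRead))
}
```

With a filter like this, a tenant's wildcard would only ever expand to namespaces the tenant can already read.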

phillebaba avatar Nov 03 '20 12:11 phillebaba

Not sure if it helps, but it'd be fine if it was only allowed for alerts defined in the flux-system namespace.

anders-olofsson avatar Nov 11 '20 16:11 anders-olofsson

> Not sure if it helps, but it'd be fine if it was only allowed for alerts defined in the flux-system namespace.

This would work for us.

That said, I think the issue around multi-tenancy is a cluster policy issue rather than something that needs to be solved by this project. It's up to the cluster administrator to set policy on who can create Alert resources, and whether they can create them with a namespace of *.

Perhaps as a stopgap, the ability to have * for the namespace could be behind a flag, so that the cluster administrators have a simple way to allow or disallow this functionality?

Niksko avatar Jan 03 '21 23:01 Niksko

How about a new kind, ClusterAlert, in order to be alerted on all objects cluster-wide? Like ClusterRole and Role.

nvanheuverzwijn avatar Sep 30 '21 15:09 nvanheuverzwijn

> This would break multi-tenancy, imagine a tenant will create a "global" alert in their namespace and it will receive events with sensitive information about all the other tenants. I find this unacceptable, imagine if AWS would allow anyone to route all events from Cloudwatch no matter the account.

This is in no way an equivalent comparison. We allow Flux in a single namespace (flux-system) to manage resources in all other namespaces without explicitly opening up each namespace for the Flux controllers. In the same way, we should be able to set up notifications and alerts for those resources.

arash-bizcover avatar Nov 19 '21 07:11 arash-bizcover

I, as many others, would have a use case for the proposed behaviour, and would be happy to put some work towards it. I've used Flux extensively for about a year now but never really dug into the codebase.

For what it's worth I feel like @nvanheuverzwijn's suggestion of having a non-namespaced ClusterAlert resource is the cleaner approach, because it doesn't break any existing functionality, accurately represents what namespace: '*' would try to achieve on a more abstract level, and would allow using RBAC and other familiar mechanisms for multi-tenancy scenarios.

Could this issue be put onto the agenda for the next meeting or something? It seems that there's some disagreement about if and how this should be implemented at all, and I'd like to have a clear goal for what a PR solving this issue should entail.

itspngu avatar Dec 10 '21 15:12 itspngu

A ClusterAlert doesn't solve much, because it would refer to a ClusterProvider that would in turn refer to a Kubernetes secret, and the Kubernetes team rejected the proposal for ClusterSecrets. I'm for revisiting the namespace wildcard option after RFC-0003 gets approved.

stefanprodan avatar Dec 10 '21 15:12 stefanprodan

Since #319 implemented a way to disable cross-namespace references, is this something you would consider now @stefanprodan?

au2001 avatar Feb 17 '22 18:02 au2001

To add some context here, a piece of feedback I receive from developers after rolling out flux2 is that they don't find out quickly enough when their Helm releases fail. Our cluster setup is a namespace per customer (lots of namespaces). We are monitoring this externally via Datadog checks, but it would be great if the infrastructure team could take care of this for developers. Our current workaround is enumerating namespaces and making lots of objects, but it's slightly error-prone and noisy.
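
One variant of the enumeration workaround, sketched under assumptions: instead of one object per namespace, it renders a single Alert in flux-system with one eventSources entry per namespace, which only works while cross-namespace references are allowed. The Alert name, the Provider name, the namespace list and the apiVersion string are placeholders to adjust to the installed Flux version. The buildSources and renderAlert helpers are hypothetical.

```go
package main

import (
	"fmt"
	"strings"
)

// EventSource is a simplified stand-in for one eventSources entry.
type EventSource struct {
	Kind      string
	Name      string
	Namespace string
}

// buildSources creates one catch-all entry per namespace for a kind.
func buildSources(kind string, namespaces []string) []EventSource {
	sources := make([]EventSource, 0, len(namespaces))
	for _, ns := range namespaces {
		sources = append(sources, EventSource{Kind: kind, Name: "*", Namespace: ns})
	}
	return sources
}

// renderAlert prints an Alert manifest as YAML text. The apiVersion
// is an assumption; check which version your cluster serves.
func renderAlert(name, namespace, provider string, sources []EventSource) string {
	var b strings.Builder
	b.WriteString("apiVersion: notification.toolkit.fluxcd.io/v1beta1\n")
	b.WriteString("kind: Alert\n")
	fmt.Fprintf(&b, "metadata:\n  name: %s\n  namespace: %s\n", name, namespace)
	fmt.Fprintf(&b, "spec:\n  providerRef:\n    name: %s\n", provider)
	b.WriteString("  eventSeverity: error\n")
	b.WriteString("  eventSources:\n")
	for _, s := range sources {
		fmt.Fprintf(&b, "    - kind: %s\n      name: '%s'\n      namespace: %s\n", s.Kind, s.Name, s.Namespace)
	}
	return b.String()
}

func main() {
	// Placeholder namespaces; in practice this list would come from the cluster.
	sources := buildSources("HelmRelease", []string{"customer-a", "customer-b"})
	fmt.Print(renderAlert("helm-failures", "flux-system", "slack", sources))
}
```

The generated manifest still has to be regenerated whenever a namespace is added, which is exactly the error-prone part a wildcard would remove.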

cep21 avatar Feb 17 '22 18:02 cep21

+1 - Would be nice to have this feature and allow wildcards for namespaces.

jprecuch avatar Apr 01 '22 06:04 jprecuch

How are others working around this issue? I'm thinking of writing a poll-based system that runs the flux CLI from cron.

cep21 avatar Apr 06 '22 16:04 cep21

There is already an option to prevent cross-namespace alerts: https://fluxcd.io/flux/components/notification/alert/#disable-cross-namespace-selectors

So there should be nothing stopping us from implementing * wildcard namespaces and simply disallowing them when that flag is enabled.
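
Roughly, the check could look like the sketch below. It is not taken from the controller: validateSourceNamespace and the two error values are hypothetical, and the noCrossNamespaceRefs parameter is assumed to mirror the existing --no-cross-namespace-refs flag described in the linked docs.

```go
package main

import (
	"errors"
	"fmt"
)

// Hypothetical errors returned when the lockdown flag forbids a source.
var (
	errWildcardDenied       = errors.New("wildcard namespace is not allowed when cross-namespace references are disabled")
	errCrossNamespaceDenied = errors.New("cross-namespace event source is not allowed")
)

// validateSourceNamespace decides whether an event source namespace is
// acceptable for an Alert living in alertNS. An empty source namespace
// is treated as "same namespace as the Alert" (an assumption).
func validateSourceNamespace(sourceNS, alertNS string, noCrossNamespaceRefs bool) error {
	if sourceNS == "" || sourceNS == alertNS {
		return nil // same-namespace sources are always fine
	}
	if !noCrossNamespaceRefs {
		return nil // lockdown is off: explicit namespaces and "*" are allowed
	}
	if sourceNS == "*" {
		return errWildcardDenied
	}
	return errCrossNamespaceDenied
}

func main() {
	for _, locked := range []bool{false, true} {
		err := validateSourceNamespace("*", "team-a", locked)
		fmt.Printf("no-cross-namespace-refs=%t wildcard error: %v\n", locked, err)
	}
}
```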

JVMartin avatar Jan 08 '23 02:01 JVMartin

I have also run into the inability to handle notifications centrally in my clusters. In all my clusters, every application is managed with a HelmRelease and, as is typical, I have one namespace per application. I decided to keep the HelmRelease resources inside the application namespaces, because that is also where the Helm chart gets installed: when running helm ls inside the namespace, the chart doesn't appear if the HelmRelease resource lives in flux-system, for example.

Because an Alert resource cannot reference a Provider in another namespace and cannot pick up events from other namespaces, I am not able to use it at all: I would have to deploy an Alert resource, a Provider resource AND the necessary secrets in every single application namespace. That's no fun, and security-wise it is worse than a centralized approach where a "global" Alert resource takes care of the entire cluster.

fibbs avatar May 12 '24 07:05 fibbs