
Flagger Environment Detection - Kubernetes Downward API?

Open kingdonb opened this issue 1 year ago • 1 comments

What I was wondering is whether Flagger could build an opt-in feature like the Kubernetes Downward API, or whether the Downward API itself could be used for this purpose. The goal is to make it much easier for a Flagger-enabled app to detect whether it is running as a canary, without creating additional RBAC grants for every Flagger-enabled app that needs to do this.

I think it can, with a bit of knowledge about how Flagger works. I was wondering a bit more directly if Flagger could add environment variables that answer this question directly, so the program can get this information without embedding such logic.

I don't see any mechanism described in the Downward API docs for extending it, but Flagger creates the pods/deployments, so I guess it would be in a position to add environment variables at the very least. (I don't think we do this yet, and I'm not sure whether we would ever consider doing it.)

If users mount a volume fieldRef through the Kubernetes Downward API, then I think that may already be a solution that doesn't require any change in Flagger. Should we document this somewhere in Flagger, to save users from doing all this research? Am I coming to the correct conclusion from reading this doc:

https://kubernetes.io/docs/concepts/workloads/pods/downward-api/#downwardapi-fieldRef

The following information is available through a downwardAPI volume fieldRef, but not as environment variables:

metadata.labels: all of the pod's labels, formatted as label-key="escaped-label-value" with one label per line
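To make the quoted format concrete, here is a minimal sketch of how an app might read that file. The mount path `/etc/podinfo/labels` is an assumption (it is whatever path the pod spec chooses for the downwardAPI volume), and the function names are hypothetical. Decoding the quoted values as JSON strings is also an assumption that should hold for ordinary label values.

```python
import json
from pathlib import Path


def parse_downward_labels(text: str) -> dict[str, str]:
    """Parse `key="escaped value"` lines, one label per line, into a dict."""
    labels: dict[str, str] = {}
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        key, sep, raw = line.partition("=")
        if not sep:
            continue  # skip malformed lines rather than fail at startup
        try:
            # Assumption: the quoted value decodes cleanly as a JSON string.
            value = json.loads(raw)
        except json.JSONDecodeError:
            value = raw.strip('"')
        labels[key] = str(value)
    return labels


def read_pod_labels(path: str = "/etc/podinfo/labels") -> dict[str, str]:
    """Read labels from the downwardAPI volume (the path is an assumed mount point)."""
    return parse_downward_labels(Path(path).read_text())
```

Which label actually distinguishes a canary pod from a primary pod depends on how the workload is labeled, so the sketch stops at parsing and leaves that check to the caller.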

Based on the code in my app for deciding whether I am a canary, I think it is probably possible to tell from the Downward API, using only the labels on the current pod and nothing else, whether you have a canary pod or not. Is this documented anywhere, or does it make sense to add it to the Flagger documentation? I think this is a pretty common idea for Flagger users.

Is there any focused documentation about detecting Flagger from within Kubernetes? E.g., I've written apps before where I felt it was important for the UI/UX that (at least admin) users can tell whether their requests are being served by a canary pod or a primary pod, so I built an interface with several service modules (Kubernetes, Flagger, and Canary) and a helper module (Canaries).

I used the HOSTNAME environment variable to determine my pod's identity from inside the pod, and I gave the pod RBAC access to list/read Canary resources from the Flagger API. All this was before I knew about the Kubernetes Downward API. I figured that I could find out which labels my application gets from inside the app if I could list pods, so it's easy enough to get all the pods (another RBAC grant), filter by labels until I have only the primary pods or only the canary pods, and check for my identity in either of those lists to know whether I am a canary pod or not. This is a bit more complicated, but it still works, and it gives me more information that I need for my admin interface, plus more information than I would get through the Downward API today.

That's several round trips, so if I needed this information on every page render I probably would look for a more efficient way to determine if I was a Canary pod or not, perhaps Downward API. I think that a pod will always be a canary or a primary so I can check this information once at pod startup and not repeat. Is it worth documenting any of this as well? It seems like there might be a place for a document that spells all of this out explicitly. I'd be happy to contribute something if we think that's a good idea.
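The "check once at pod startup" idea could be sketched like this. It is only an illustration: `once` is a hypothetical helper, and the detection function passed to it stands in for whichever method is used (API lookups or the Downward API).

```python
import functools
from typing import Callable


def once(detect: Callable[[], str]) -> Callable[[], str]:
    """Wrap a role-detection function so it runs at most once per process.

    This relies on the assumption above that a pod's role never changes
    during its lifetime, so the first answer can be reused.
    """
    return functools.lru_cache(maxsize=1)(detect)


def detect_role_slowly() -> str:
    # Placeholder for the expensive lookup (several API round trips).
    return "canary"


get_role = once(detect_role_slowly)
```

Every page render can then call `get_role()` and only the first call pays for the lookup.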

I've heard from users that they are doing this or similar with their Flagger workloads, for more directly operational purposes than my admin interface.

For example, there's a bit of code that selects a Kafka topic subscription for the application: one topic should only be used by primary pods, and the canary pods pull from a different topic. They need to be able to tell from their code whether they are running in a canary or a primary pod, so they can select the appropriate topic during the rollout. I'm not sure about the details, but I think we can get those users involved in the discussion if we have questions about it.

kingdonb, Oct 27 '23 13:10

I consider this a very important feature that is currently missing in Flagger.

I had to resolve from within the pod whether it is a primary or a canary for this specific case of changing the Kafka topic that the pod subscribes to. To resolve this, I used the Kubernetes Downward API to define an environment variable with the pod name. Since Flagger attaches a "-primary" suffix to the primary pods, and none to the canaries, I built this simple logic inside my code to select the appropriate Kafka topic. While this is a simple solution, I consider it not appropriate at all. In practice, one will likely be deploying a third-party image and be forced to implement this logic elsewhere, perhaps in an init container.
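A rough sketch of that workaround, not the actual code: it assumes the pod name is exposed as a `POD_NAME` environment variable via a Downward API fieldRef on `metadata.name`, and that pods come from a Deployment, so their names look like `<deployment>-<replicaset-hash>-<random-suffix>`. The topic names are hypothetical.

```python
import os


def deployment_name_from_pod(pod_name: str) -> str:
    """Strip the ReplicaSet hash and random suffix from a Deployment pod name."""
    parts = pod_name.rsplit("-", 2)
    return parts[0] if len(parts) == 3 else pod_name


def role_from_pod_name(pod_name: str) -> str:
    """Return "primary" if the owning workload name ends in "-primary", else "canary"."""
    if deployment_name_from_pod(pod_name).endswith("-primary"):
        return "primary"
    return "canary"


def select_topic(pod_name: str,
                 primary_topic: str = "orders",
                 canary_topic: str = "orders-canary") -> str:
    """Pick the Kafka topic for this pod (topic names are made up for illustration)."""
    return primary_topic if role_from_pod_name(pod_name) == "primary" else canary_topic


if __name__ == "__main__":
    # POD_NAME is an assumed variable name; HOSTNAME usually holds the pod name too.
    name = os.environ.get("POD_NAME") or os.environ.get("HOSTNAME", "")
    print(select_topic(name))
```

The name check is fragile: an application whose own name ends in "-primary" would be misclassified, which is part of why a dedicated label would be cleaner.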

If Flagger added a label or annotation to the pods, e.g., flagger.stage: "primary" and flagger.stage: "canary", then the Kubernetes Downward API could be used to define environment variables specific to each stage, without the need to add any custom logic to the app.

Flagger already supports adding labels and annotations to the generated services (see https://docs.flagger.app/usage/how-it-works#canary-service) in a very flexible way so that users can define their own labels/annotations for the primary and canary. Implementing the same feature to allow defining labels/annotations to the pods would be a very elegant way to avoid the hassle above.

Another use case for such a feature is better observability. For example, the canary/primary labels could be used to compute metrics for each stage separately, or to tag traces. This would make it possible to observe the effect of a canary microservice on the whole application, for instance.

burigolucas, Oct 28 '23 21:10