opentelemetry-operator Add handling of environment variables from ConfigMaps and Secrets

Description: <Describe what has changed.>

Added detection and reading of environment variables defined in PODs resources like ConfigMaps and Secrets. Supports both envFrom and valueFrom forms.

Link to tracking Issue(s): <Issue number if applicable>

Resolves:
- #1393
- #1814

Testing:

Added kubebuilder tests for new code, tested by deploying the generated image to OpenShift cluster.

Documentation: <Describe the documentation added.>

This is an enhancement, which can be read also as a bug fix (missing feature), so I have not added any documentation.

Implementation notes/questions:

The ConfigMap support is in first commit, the Secret support is in the second commit. They work the same, they both support “if exists” checks and are able to extend the existing value (append) into a newly created (or updated) environment variable.
I was not sure about the Container class naming (or have some common methods prefix?), so please suggest how the main class/file should be named - I can change it easily. The class handles everything regarding environment variables on the container currently.
I was not sure about the usage of namespace name. The sdkInjector.inject method gets one in arguments, but Apache and Nginx auto-instrumentations are ignoring it and taking namespace name from POD definition or from PODs resource map, which looks wrong (at minimum this is suspicious). I took the namespace name from the POD and if not found, I fall back to the one supplied as a parameter to inject method.
I was not sure about the reading of Secrets (second commit) - whether it is secure to handle them fully. The implementation is as full as for ConfigMaps, so the values can be detected (“if exists” checks) and extended (“append when exists”) into a new or updated environment variable (so Secret is converted to env var!). If necessary, I can change the implementation to just recognise them (record the existence) but fail when the value should be extended.

Oct 19 '24 15:10 oldium

Tested ConfigMaps with custom built image oldium/opentelemetry-operator:configmaps-2024-10-20-1 on our test cluster.

Oct 20 '24 20:10 oldium

Handling Secrets requires updated permissions, otherwise it ends up with:

failed to list *v1.Secret: secrets is forbidden: User "system:serviceaccount:opentelemetry-operator-system:opentelemetry-operator-controller-manager" cannot list resource "secrets" in API group "" at the cluster scope

~I am really in doubt on how to handle Secrets~ I am not an K8s security expert, so regarding Secrets we can (1) read them (needs permission update) and put into env vars when appended (current implementation), or (2) to read them (needs permission update), but only record existence and fail when appending (this adds visibility of EnvFrom Secrets), or (3) ignore Secrets as it is now (only ValueFrom envs will be visible to OTEL, no EnvFrom variables).

Please suggest what to do. Thanks.

EDIT: Where does release asset Installation manifest for Kubernetes come from exactly? I am not able to find it in sources, so I cannot patch it. It differs from config/rbac/role.yaml, so it has to be something else. The config/rbac/role.yaml has:

- apiGroups:
  - ""
  resources:
  - namespaces
  - secrets
  verbs:
  - get
  - list
  - watch

while release asset:

- apiGroups:
  - ""
  resources:
  - namespaces
  verbs:
  - list
  - watch

Oct 20 '24 21:10 oldium

So the Secrets code works fine with the following kustomize patch to the Release asset (it syncs value to config/rbac/role.yaml state). Here is what I tested:

apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
namespace: opentelemetry-operator-system

resources:
- https://github.com/open-telemetry/opentelemetry-operator/releases/download/v0.110.0/opentelemetry-operator.yaml
- ... (other config)

patches:
- patch: |-
    - op: add
      path: /rules/2/resources/1
      value: secrets
    - op: add
      path: /rules/2/verbs/2
      value: get
  target:
    kind: ClusterRole
    name: opentelemetry-operator-manager-role

images:
- name: ghcr.io/open-telemetry/opentelemetry-operator/opentelemetry-operator
  newName: oldium/opentelemetry-operator
  newTag: configmaps-2024-10-20-1

Oct 20 '24 21:10 oldium

can you adjust an E2E test to ensure this logic works as expected?

You mean to add e2e tests for ConfigMaps and Secrets to ensure it works, right? Sure.

Oct 21 '24 19:10 oldium

@oldium please take a look at #3379 , it's an alternative solution that is much simpler.

Oct 22 '24 16:10 pavolloffay

@pavolloffay If I read it correctly, this is a solution for appending to an existing value and only when defined directly in the POD's env itself.

I am solving wider issue - my code is able to actually read ConfigMaps and Secrets, so when for example OTEL_SERVICE_NAME is defined in ConfigMap (via envFrom), it is not overwritten by OTEL Operator.

Also it allows defining JAVA_TOOL_OPTIONS in ConfigMap and it still works - OTEL Operator will append to the value correctly, becuse it is actually reading the values.

Oct 23 '24 14:10 oldium

So, to summarise. Your PR solves issue for the case when the environemntal variable is defined via valueFrom. My PR solves issue for the case when the environmental variable is defined either via envFrom or valueFrom.

Oct 23 '24 14:10 oldium

Conceptually, I don't like the fact that this PR effectively reimplements Kubernetes' environment variable mounting logic in the operator. This really shouldn't be necessary. In my mind, what the operator should do is add the env variables if the user hasn't set them, with some language-specific exceptions where it should concatenate the values instead. As #3379 shows, we don't actually need to read external ConfigMaps and Secrets to achieve the latter.

I do believe the operator may override some env variables set by the user in the CR, and we should fix that. But it's sufficient to add user-defined variables to the end of the list to do this, without need for all the elaborate machinery in this PR.

Oct 23 '24 14:10 swiatekm

We use ConfigMaps in base directory and customize them with kubectl kustomize in overlay directories for all environments and we do it for all our services. We set all environmental properties via ConfigMaps, they are easily overridable like this. Currently we are forced to set OTEL-specific environmental variables in Deployments and Argo-Rollouts via patches, because the value in ConfigMap is ignored. We only would like to use ConfigMaps for environmental variables, because this is feature of Kubernetes after all.

Oct 23 '24 21:10 oldium

As #3379 shows, we don't actually need to read external ConfigMaps and Secrets to achieve the latter.

You cannot put the OTEL_SERVICE_NAME and other environmental values in single (overridable) ConfigMap without directly referencing all the variables separately with valueFrom for #3379 to work.

Kubernetes can load the whole ConfigMap with envFrom field, this is much easier to handle. We have over 100 services running in 50 environments (tests, staging, production), so actually 100 base directories and 5000 overrides, so having ConfigMaps (which is simple key-value text file) referenced by envFrom in the service's Deployment/Rollout in base directory is simply much easier to handle than taking care of multiple valueFrom for each environmental value loaded from ConfigMap.

This is a long-standing maintaintenance pain for us - this Kubernetes feature works otherwise fine, but OTEL Operator misses it. Please consider adding this feature.

Oct 23 '24 21:10 oldium

Allright, that makes sense. I'd like to support what you're asking for, but I'd really like to avoid all the machinery in this PR. We'll discuss it during our next SIG meeting and hopefully find some alternative. For the Java and Node option variables specifically, I think #3379 can solve it as-is:

You set JAVA_TOOL_OPTIONS in your ConfigMap, which you attach via envFrom.
In the container env, you add JAVA_TOOL_OPTIONS: $(JAVA_TOOL_OPTIONS)
Operator sees the variable defined and merges it with its own options.

A slightly annoying workaround, but it should be easy to add to your kustomize base.

Oct 24 '24 11:10 swiatekm

Allright, that makes sense. I'd like to support what you're asking for, but I'd really like to avoid all the machinery in this PR. We'll discuss it during our next SIG meeting and hopefully find some alternative. For the Java and Node option variables specifically, I think #3379 can solve it as-is:

You set JAVA_TOOL_OPTIONS in your ConfigMap, which you attach via envFrom.

In the container env, you add JAVA_TOOL_OPTIONS: $(JAVA_TOOL_OPTIONS)

Operator sees the variable defined and merges it with its own options.

A slightly annoying workaround, but it should be easy to add to your kustomize base.

For like 100 services with 50 overlays each, it is better to have the values present just once. It is also easier to grep for the value afterwards when necessary. So if I have to choose between two JAVA_TOOL_OPTIONS (one placeholder value in the deployment, one actual value in the ConfigMap) and direct value in the deployment, I will choose the direct value in the deployment and skip the ConfigMap altogether. That is because not all services have JAVA_TOOL_OPTIONS, not all services have OTEL_SERVICE_NAME and so on - some have the value just in test overlays, some in base, some override it just for test environments and the like. It is really per-service and per-environment decision.

Before OpenTelemetry we used just ConfigMaps and were “happy” (almost, you know).

Oct 24 '24 12:10 oldium

I have rebased the code and added e2e tests.

Oct 24 '24 12:10 oldium

I am solving wider issue - my code is able to actually read ConfigMaps and Secrets, so when for example OTEL_SERVICE_NAME is defined in ConfigMap (via envFrom), it is not overwritten by OTEL Operator.

This is the current behavior of the operator. It checks all env vars and if any OTEL_ is specified it will honor it. The operator does not need to look into the config map for this.

Also it allows defining JAVA_TOOL_OPTIONS in ConfigMap and it still works - OTEL Operator will append to the value correctly, becuse it is actually reading the values.

The #3379 solves this issue as well by setting e.g. JAVA_TOOL_OPTIONS=$(JAVA_TOOL_OPTIONS) -javaagent (https://kubernetes.io/docs/tasks/inject-data-application/define-interdependent-environment-variables/).

Overall it's not clear what are the benefits of this PR compared to simpler #3379

Oct 24 '24 13:10 pavolloffay

I am solving wider issue - my code is able to actually read ConfigMaps and Secrets, so when for example OTEL_SERVICE_NAME is defined in ConfigMap (via envFrom), it is not overwritten by OTEL Operator.

This is the current behavior of the operator. It checks all env vars and if any OTEL_ is specified it will honor it. The operator does not need to look into the config map for this.

This is not how it works. If you define OTEL_SERVICE_NAME in ConfigMap, which is referenced by envFrom from the Deployment, the OTEL Operator creates a new OTEL_SERVICE_NAME env var in a POD, because it does not see anything what is defined inside the ConfigMap. OTEL Operator only sees env vars defined in the Deployment, i.e. the POD directly.

My colleague just faced the issue (again, sigh!) and had to move the OTEL_SERVICE_NAME environment variable from ConfigMap into Deployment.

Oct 24 '24 13:10 oldium

I see what you mean https://kubernetes.io/docs/tasks/configure-pod-container/configure-pod-configmap/#configure-all-key-value-pairs-in-a-configmap-as-container-environment-variables

apiVersion: v1
kind: Pod
metadata:
  name: dapi-test-pod
spec:
  containers:
    - name: test-container
      image: registry.k8s.io/busybox
      command: [ "/bin/sh", "-c", "env" ]
      envFrom:
      - configMapRef:
          name: special-config
  restartPolicy: Never

This will define all configmap keys as env vars.

Oct 24 '24 13:10 pavolloffay

I see what you mean https://kubernetes.io/docs/tasks/configure-pod-container/configure-pod-configmap/#configure-all-key-value-pairs-in-a-configmap-as-container-environment-variables
apiVersion: v1
kind: Pod
metadata:
  name: dapi-test-pod
spec:
  containers:
    - name: test-container
      image: registry.k8s.io/busybox
      command: [ "/bin/sh", "-c", "env" ]
      envFrom:
      - configMapRef:
          name: special-config
  restartPolicy: Never
This will define all configmap keys as env vars.

Yes, exactly. I have added e2e tests today into this PR, so you can find examples there too.

Oct 24 '24 13:10 oldium

@oldium we are concerned with this PR and we would like to avoid reading the configmaps in the operator.

There is issue for discussing alternative approaches https://github.com/open-telemetry/opentelemetry-operator/issues/3392

Oct 25 '24 13:10 pavolloffay