Instrumentation support selector
### Component(s)

auto-instrumentation
### Is your feature request related to a problem? Please describe.
Instrumentation applies to the specified Pod, similar to this:
https://skywalking.apache.org/docs/skywalking-swck/latest/java-agent-injector/#1-label-selector-and-container-matcher
### Describe the solution you'd like
```yaml
apiVersion: opentelemetry.io/v1alpha1
kind: Instrumentation
metadata:
  name: my-instrumentation
spec:
  selector:
    app: appname
```
### Describe alternatives you've considered

_No response_
### Additional context

_No response_
This approach has one considerable downside compared to the annotation. The annotation changes the pod spec and therefore triggers a re-deployment, which is needed by the OTel operator to inject the auto-instrumentation (the operator uses a pod mutating webhook).
> therefore it triggers re-deployment which is needed by the OTEL operator to inject the auto-instrumentation (the operator uses pod mutating webhook)
Yeah, you're right.
However, we hope that all OTel-related configuration can be handed over to the Operator; if annotations are used, we also need to maintain them ourselves during deployment.
This is our usage scenario; I'll submit a PR if appropriate.
```yaml
apiVersion: opentelemetry.io/v1alpha1
kind: Instrumentation
metadata:
  name: my-instrumentation
spec:
  selector:
    app: appname
  propagators:
    - tracecontext
  java:
    image: image
```
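For reference, the `selector` in the example above would presumably follow the usual Kubernetes label-matching semantics: every key/value pair in the selector must be present on the pod's labels, while extra pod labels are ignored. A minimal sketch of that matching logic (the helper name is hypothetical, not operator code):

```python
def matches_selector(match_labels: dict, pod_labels: dict) -> bool:
    """True if every key/value pair in match_labels is present in pod_labels."""
    return all(pod_labels.get(key) == value for key, value in match_labels.items())


# A pod with extra labels still matches as long as the selector keys agree.
pod = {"app": "appname", "env": "prod"}
print(matches_selector({"app": "appname"}, pod))  # True
print(matches_selector({"app": "other"}, pod))    # False
```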
@crossoverJie we discussed this issue at the SIG (and realized we should probably have done this prior to your PR) and had a few follow-up questions.
I saw the link you shared for how SkyWalking handles this, and at first glance it seemed very similar; however, there's a key difference. SkyWalking appears to be a Java-only auto-instrumentation solution, whereas OpenTelemetry can inject instrumentation for multiple languages. That results in what we agreed is a confusing experience for users, where a cluster admin needs to set a piece of configuration AND a tenant in the cluster needs to set configuration as well.
1. What are the user stories for this request, and how do they differ from today? With the change you've made, it seems to be a very similar end state where both the cluster admin and the tenant need to take action, which feels against the goal of the issue, which is to put the onus of instrumentation solely in the hands of the cluster admin.
2. What are the reasons for having multiple instrumentation resources for the same language in the same cluster? I'm not sure I understand the exact issue today that the current solution solves.
I was imagining it would be more beneficial for both personas if the cluster admin could specify a set of rules that specify a selector for applications to target for injection and specify which languages (and optionally container names) to inject. Let me know your thoughts here, and if you'd like to meet more to discuss the above I'd be happy to chat.
@jaronoff97
Thank you for following up on this issue.
> That results in what we agreed to be a confusing experience for users where a cluster admin needs to set a piece of configuration AND a tenant in the cluster needs to set configuration as well.
Can you explain the relationship between cluster admin and tenants here?
In my scenario, I'm the only one maintaining Instrumentation.
> 2. What are the reasons for having multiple instrumentation resources for the same language in the same cluster? I'm not sure I understand the exact issue today that the current solution solves.
```yaml
apiVersion: opentelemetry.io/v1alpha1
kind: Instrumentation
metadata:
  name: instrumentation-sit-grpc-consumer
  namespace: sit
spec:
  env:
    - name: OTEL_SERVICE_NAME
      value: grpc-consumer
  selector:
    matchLabels:
      app: grpc-consumer
  java:
    image: autoinstrumentation-java:v1
    extensions:
      - image: extensions:v1
        dir: /extensions
    env:
      - name: OTEL_RESOURCE_ATTRIBUTES
        value: service.name=grpc-consumer
---
apiVersion: opentelemetry.io/v1alpha1
kind: Instrumentation
metadata:
  name: instrumentation-sit-grpc-provider
  namespace: sit
spec:
  env:
    - name: OTEL_SERVICE_NAME
      value: grpc-provider
  selector:
    matchLabels:
      app: grpc-provider
  java:
    image: autoinstrumentation-java:v1
    extensions:
      - image: extensions:v2
        dir: /extensions
    env:
      - name: OTEL_RESOURCE_ATTRIBUTES
        value: service.name=grpc-provider
```
This is our usage scenario, and there are some application-specific configurations that need to be defined in different Instrumentations.
There are some other examples:
- New versions of javaagent are only tested on specific applications first.
- Some specific applications only need to report metrics but do not want to report traces.
- For resource isolation, the `OTEL_EXPORTER_OTLP_ENDPOINT` reported by some applications may be different.
Therefore we like that each application can maintain its own Instrumentation file independently.
Thanks for adding more context! Given that you want each application to maintain its own Instrumentation, it's unclear to me why adding the name of the instrumentation in the application is more effort than today? All of the use cases you mentioned are solved with the existing system of annotations by allowing users to specify which instrumentation they want to use.
I still think the broader use case you describe is valuable, however, and it would be much simpler if the instrumentation selected its applications too. I think that instrumentation selector only works (or works best) when the user doesn't need to modify their application whatsoever to do so.
Thanks for your reply, the following issues may require further discussion.
> it's unclear to me why adding the name of the instrumentation in the application is more effort than today?
```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  labels:
    app: order
  name: order
spec:
  template:
    metadata:
      annotations:
        instrumentation.opentelemetry.io/container-names: "order"
        instrumentation.opentelemetry.io/inject-java: "true"
        # associate instrumentation
        instrumentation.opentelemetry.io/instrumentation-name: "instrumentation-prod"
  # ...
---
apiVersion: opentelemetry.io/v1alpha1
kind: Instrumentation
metadata:
  name: instrumentation-prod
  namespace: prod
# ...
```
Do you mean to associate Instrumentation in deployment?
> All of the use cases you mentioned are solved with the existing system of annotations by allowing users to specify which instrumentation they want to use.
I don't fully understand this part.
As in my last comment, we currently create an Instrumentation for each application and define environment variables in that Instrumentation.
Is there any other way to configure these environment variables?
We don't want to configure startup parameters in the application; we want all configuration to be done by the Operator.
> I think that instrumentation selector only works (or works best) when the user doesn't need to modify their application whatsoever to do so.
Not sure I understand what you mean.
Currently there is no need to modify the application, just associate the application through the selector in the Instrumentation.
The example you linked demonstrates what I was referring to, i.e. creating an instrumentation object and then referencing it directly via its name in the application.
> Currently there is no need to modify the application, just associate the application through the selector in the Instrumentation.
Even in the world where we apply a selector as your PR does today, there is still a need to modify the application with at least the following. Per your example:
```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  labels:
    app: order
  name: order
spec:
  template:
    metadata:
      annotations:
        instrumentation.opentelemetry.io/container-names: "order"
        instrumentation.opentelemetry.io/inject-java: "true" # <<< This is always needed today
        # associate instrumentation
        instrumentation.opentelemetry.io/instrumentation-name: "true" # <<< This is still needed in your PR
```
What I'm thinking is a world where a cluster admin like yourself would define an instrumentation like so:
```yaml
apiVersion: opentelemetry.io/v1alpha1
kind: InstrumentationRules
metadata:
  name: instrumentation-prod
  namespace: prod
spec:
  rules:
    - name: inject-java-app-prod
      java: <java-config>
      env: []
      selector:
        matchLabels:
          app: order
          env: prod
    - name: inject-java-app-dev
      java: <java-config>
      env: []
      selector:
        matchLabels:
          app: order
          env: dev
```
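With several rules in one resource, a pod could in principle match more than one of them, so the operator would need a deterministic tie-break; one plausible choice is first-match-wins in declaration order. A hedged sketch of that selection step, assuming simple `matchLabels` equality (function and data names are hypothetical):

```python
def select_rule(rules: list, pod_labels: dict):
    """Return the name of the first rule whose matchLabels are all satisfied
    by the pod's labels, or None if no rule matches."""
    for rule in rules:
        match_labels = rule.get("selector", {}).get("matchLabels", {})
        if all(pod_labels.get(key) == value for key, value in match_labels.items()):
            return rule["name"]
    return None


# Mirrors the two rules in the InstrumentationRules example above.
rules = [
    {"name": "inject-java-app-prod", "selector": {"matchLabels": {"app": "order", "env": "prod"}}},
    {"name": "inject-java-app-dev", "selector": {"matchLabels": {"app": "order", "env": "dev"}}},
]
print(select_rule(rules, {"app": "order", "env": "dev"}))  # inject-java-app-dev
```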
This would then inject configuration for the following deployments without requiring any modifications in the deployment:
```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  labels:
    app: order
    env: prod
  name: prod-order
spec:
  # ...
---
apiVersion: apps/v1
kind: Deployment
metadata:
  labels:
    app: order
    env: dev
  name: dev-order
spec:
  # ...
```
This to me would be closer to what SkyWalking can accomplish today and would allow cluster admins to entirely separate injection from application developers. Let me know what you think, thank you!
@jaronoff97 Sorry for the delayed response.
> What i'm thinking is a world where a cluster admin like yourself would define an instrumentation like so:
Using a new CRD, `InstrumentationRules`, seems like a good approach. We just need to move the configuration from `Instrumentation` to `InstrumentationRules`.
One more question: if we use `InstrumentationRules`, is it still necessary to keep `Instrumentation`? Or will `Instrumentation` be removed in a future update?
@crossoverJie this is a good question, I think before we can answer it we should have a design in mind for InstrumentationRules to see if it's a superset or distinct (there are tradeoffs for either approach). Would you be able to work on that design? If not, I can see if anyone else is available to do so. (my cmd+enter pressed closed... oops!)
> my cmd+enter pressed closed... oops!
Hahaha, no worries.đŸ˜‚
> Would you be able to work on that design?
I will draft a proposal for `InstrumentationRules` as soon as possible. We've been running the selector in our testing environment for some time, and we hope to use it in production soon. The main goal is to stay consistent with the upstream code.
```yaml
apiVersion: opentelemetry.io/v1alpha1
kind: InstrumentationRules
metadata:
  name: instrumentation-prod
  # It will only select the Pods in this namespace.
  namespace: prod
spec:
  # If a rule does not specify its own, these exporter/propagators/sampler settings will be used.
  exporter:
    endpoint: http://global-otel-collector:4317
  propagators:
    - tracecontext
    - baggage
    - b3
  sampler:
    type: parentbased_traceidratio
    argument: "0.35"
  rules:
    - name: order-service
      java:
        image: ""
        extensions:
          - image: ""
            dir: /extensions
      exporter:
        endpoint: http://prod-otel-collector:4317
      propagators:
        - tracecontext
        - baggage
        - b3
      sampler:
        type: parentbased_traceidratio
        argument: "0.25"
      env:
        - name: OTEL_EXPORTER_OTLP_PROTOCOL
          value: grpc
        - name: OTEL_SERVICE_NAME
          value: order-service
      # Multi-container pods with a single instrumentation
      container-names: [order] # If there are multiple containers in the pod, specify the container names here; otherwise the first container in the pod will be used.
      selector:
        matchLabels:
          app: order
    - name: user-service
      # Multi-container pods with multiple instrumentations
      java:
        env: []
        container-names: [user-java]
      python:
        env: []
        container-names: [user-python]
      selector:
        matchLabels:
          app: user
```
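The `container-names` comment above describes a fallback: inject into the named containers if any are listed, otherwise into the pod's first container. A minimal sketch of that resolution, assuming containers are given in pod order (function name is hypothetical):

```python
def target_containers(container_names: list, pod_containers: list) -> list:
    """Containers to inject into: the named ones (in pod order) if any are
    specified, otherwise just the first container in the pod."""
    if container_names:
        return [c for c in pod_containers if c in container_names]
    return pod_containers[:1]


# With container-names: [order], a sidecar like istio-proxy is skipped.
print(target_containers(["order"], ["istio-proxy", "order"]))  # ['order']
# With no container-names, the first container is used.
print(target_containers([], ["app", "sidecar"]))               # ['app']
```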
The purpose of this proposal is to flexibly configure instrumentation for any application without modifying the Deployment.
It defines a new CRD named `InstrumentationRules`, which can implement the functionality provided by `Instrumentation`.
Additionally, there are a few issues that need to be confirmed:
- [ ] Can multiple `InstrumentationRules` be created?
  - [ ] I prefer to be able to create multiple, as operations can create an `InstrumentationRules` for each namespace, which makes the data easier to maintain.
- [ ] In a given namespace, all Pods use the same `InstrumentationRule`.
  - [ ] I haven't found a particularly good way to declare it. Do you have any suggestions?
- [ ] Compatibility issues with `Instrumentation`.
  - [ ] The first version prioritizes `Instrumentation`, and once the `InstrumentationRule` functionality is stable, it can replace `Instrumentation`.
This is just a draft, and I look forward to your reply. Thank you, @jaronoff97.
```yaml
apiVersion: opentelemetry.io/v1alpha1
kind: InstrumentationRules
metadata:
  name: instrumentation-rule-for-namespace
  # It will only select the Pods in this namespace.
  namespace: prod
spec:
  # If usedForNamespace is true, the rule will be used for the whole namespace.
  usedForNamespace: true
  # If a rule does not specify its own, these exporter/propagators/sampler settings will be used.
  exporter:
    endpoint: http://global-otel-collector:4317
  propagators:
    - tracecontext
    - baggage
    - b3
  sampler:
    type: parentbased_traceidratio
    argument: "0.35"
  rules:
    # Only one rule is allowed for a given namespace.
    - name: used-for-namespace
      java:
        env: []
      python:
        env: []
```
> In a given namespace, all Pods use the same `InstrumentationRule`.
For this issue, I suggest adding a new field called `usedForNamespace`. When it is set to true, it indicates that the current `InstrumentationRules` applies to the whole namespace. In this case, only one rule can exist.
If there are other `InstrumentationRules` applied to Pods at this time, they will have a higher priority.
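The precedence described above (pod-matching rules win over a `usedForNamespace` rule) could be sketched as follows; this is only an illustration of the proposed semantics, with hypothetical names, not operator code:

```python
def resolve_rule(pod_labels: dict, pod_rules: list, namespace_rule: dict):
    """Pod-specific rules take priority; fall back to the namespace-wide rule."""
    for rule in pod_rules:
        match_labels = rule.get("selector", {}).get("matchLabels", {})
        if match_labels and all(pod_labels.get(k) == v for k, v in match_labels.items()):
            return rule["name"]
    return namespace_rule["name"] if namespace_rule else None


pod_rules = [{"name": "order-service", "selector": {"matchLabels": {"app": "order"}}}]
namespace_rule = {"name": "used-for-namespace"}
print(resolve_rule({"app": "order"}, pod_rules, namespace_rule))  # order-service
print(resolve_rule({"app": "other"}, pod_rules, namespace_rule))  # used-for-namespace
```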
@jaronoff97 Do you have any other suggestions? If not, I'd like to start developing this feature as soon as possible.
cc @swiatekm @iblancasa https://github.com/open-telemetry/opentelemetry-operator/issues/2744#issuecomment-2232789129 https://github.com/open-telemetry/opentelemetry-operator/issues/2744#issuecomment-2244476064
Here is a new proposal regarding the selector. If you have time, please take a look.
@crossoverJie thanks for working on this! I think it would be great if you could write an RFC for this given the functionality will be net new and there will be some edge cases and rollout issues I want to be sure we all understand and agree upon. I can reach out via slack and we can work on it. I'm going to be creating an RFC process and this feels like a great first RFC to write. If you wanted to start on an implementation so we can have something as a proof of concept, that would be great too.
May I know what is the latest update on this? Is this already implemented? Anyone working on this?
@iamvishnuks I'm planning on working on this sometime in the next couple of months, we've been having a lot of discussion around design considerations, breaking changes, etc. I'm hoping to be writing an RFC soon
> @iamvishnuks I'm planning on working on this sometime in the next couple of months, we've been having a lot of discussion around design considerations, breaking changes, etc. I'm hoping to be writing an RFC soon
@jaronoff97 What's the progress on the RFC? If some parts are already complete, can we submit a draft and work together to move it forward?
If you don't have time to handle it for now, I can draft the RFC document.
@jaronoff97 Maybe we don't need to modify the CR; we can implement it by referring to this:

```yaml
instrumentation.opentelemetry.io/container-names: xx
instrumentation.opentelemetry.io/inject-java: xx
```