opentelemetry-specification
What is part of "Automatic Instrumentation"?
What are you trying to achieve?
For our documentation at opentelemetry.io I tried to come up with a quick answer to the question "where is the OpenTelemetry agent?" (see https://github.com/open-telemetry/opentelemetry.io/issues/1689 for reference; no need to read that lengthy discussion). @pellared & @reyang convinced me that the answer is: "Hey dear APM user looking for an agent: if you want to instrument your applications without having to manually instrument your code, use the OpenTelemetry auto-instrumentation."
At first sight, this seems to be straightforward, since the spec definition of auto-instrumentation states that it "refers to telemetry collection methods that do not require the end-user to modify application's source code". Also, existing docs let the user know that in order to enable automatic instrumentation, one or more dependencies need to be added, and that configuration is available via environment variables and possibly language-specific means.
What is not clear to me is what is contained in that definition of "Auto Instrumentation". Obviously it contains instrumentation of dependencies/libraries.
The docs also mention configuration (Data source specific configuration, Exporter configuration, Propagator configuration, Resource configuration), but that seems not to be common across the implementations (see below), and there are other potential building blocks that might be part of "Automatic Instrumentation".
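To make the "configuration via environment variables" building block concrete, here is a minimal sketch of how a zero-code setup could collect its settings. The variable names (`OTEL_SERVICE_NAME`, `OTEL_TRACES_EXPORTER`, `OTEL_PROPAGATORS`, `OTEL_RESOURCE_ATTRIBUTES`) come from the OpenTelemetry environment variable specification; the functions `load_config` and `parse_resource_attributes`, the shape of the returned dictionary, and the fallback values are assumptions made for illustration and are not taken from any language implementation.

```python
import os


def parse_resource_attributes(raw):
    """Parse 'k1=v1,k2=v2' into a dict, skipping malformed entries."""
    attributes = {}
    for pair in raw.split(","):
        key, sep, value = pair.partition("=")
        if sep and key.strip():
            attributes[key.strip()] = value.strip()
    return attributes


def load_config(environ=os.environ):
    """Hypothetical helper: gather settings without touching application code."""
    propagators = environ.get("OTEL_PROPAGATORS", "tracecontext,baggage")
    return {
        # Fallback values below are assumptions of this sketch.
        "service_name": environ.get("OTEL_SERVICE_NAME", "unknown_service"),
        "traces_exporter": environ.get("OTEL_TRACES_EXPORTER", "otlp"),
        "propagators": [p.strip() for p in propagators.split(",")],
        "resource_attributes": parse_resource_attributes(
            environ.get("OTEL_RESOURCE_ATTRIBUTES", "")
        ),
    }


# A plain dict stands in for the process environment here.
config = load_config({
    "OTEL_SERVICE_NAME": "checkout",
    "OTEL_RESOURCE_ATTRIBUTES": "deployment.environment=staging,team=payments",
})
```

Whether reading this configuration counts as part of "Auto Instrumentation" or belongs to a separate building block is exactly the open question of this issue.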
So, my question is, what of the following is contained in "Auto Instrumentation":
1. Is it only automatic do-not-touch-my-code library/dependency instrumentation?
2. Is it also an automatic do-not-touch-my-code all-in-one-solution (i.e. beyond instrumentation it contains configuration, exporters, samplers, resource detectors, debugging & self telemetry, extensions, etc.)?
3. Is it left to the language-specific implementations what is contained beyond do-not-touch-my-code library instrumentation?
If it is (1), then what is the name of that "all in one solution"?
If it is (2), that has some consequences, because of the difference in the use of language-specific implementations:
- Java has that "all in one solution" with the Javaagent
- .NET works towards that ("injects & configures the SDK into the application" & "adds instrumentation to key packages and APIs used by the application")
- Python docs state that automatic instrumentation requires you to install a few Python packages to successfully instrument your application’s code. There's also the distro, which allows using OpenTelemetry and auto-instrumentation as quick as possible without sacrificing flexibility.
- Node.js asks you to write a file yourself for wiring the SDK (config, exporters, resource detectors, etc.) and has a package called auto-instrumentation-node, which is a meta-package for instrumentation libraries (see also this issue)
- Ruby docs state that automatic instrumentation in Ruby is done via instrumentation packages, most commonly the opentelemetry-instrumentation-all package, and ask you to configure the SDK by writing some additional code in your application
- PHP & Go are just starting their implementations
- I don't think there are any for Rust, Swift, Erlang/Elixir.
So, some documentation might need to be rewritten (not problematic) and some packages might need a new name (problematic).
If it is (3), I am worried that end-users who move across languages will get confused, because they have different expectations of what "Auto Instrumentation" contains. This can of course be managed (via documentation), but personally I would prefer having something consistent.
Additional context
My initial issue on "where's the otel agent / what to use instead of an agent" is based on my personal experience that many end-users coming from APM vendors are looking for that "agent-equivalent" in OpenTelemetry, which is this all-in-one-solution that they throw against their application and telemetry falls out of it. This is not about the question of whether agents make sense or not; it's about picking up end-users from where they are right now.
- There's also the distro, which allows using OpenTelemetry and auto-instrumentation as quick as possible without sacrificing flexibility.
I was going to say that Python is using "distro" wrong but now I see it isn't in the glossary. Last I remember a distribution (distro) meant a third party vendor's "bundle" of OpenTelemetry and vendor specific code/configuration.
Might be worth finalizing that definition as well, but may need a separate issue?
IIRC "distro" was mentioned in the official docs but later changed from "distro" to "distributions" https://opentelemetry.io/docs/concepts/distributions/.
Yeah, Distro/Distribution is worth another (but related) discussion, @austinlparker added some comments on that: https://github.com/open-telemetry/opentelemetry.io/issues/1689#issuecomment-1262808129 & https://github.com/open-telemetry/opentelemetry.io/issues/1689#issuecomment-1263693791
But let's not have the discussion on Distribution here, the focus should be around clarifying "Auto-Instrumentation"
Here you are: https://github.com/open-telemetry/opentelemetry-specification/issues/2873
Following up on the conversation during the SIG call yesterday. Things got mixed with the "distribution discussion" (#2873), but that was intended, since there might be an overlap. I'll try to repeat a few of the things that were brought up and that are relevant to this discussion. Please correct me and chime in.
- Automatic Instrumentation describes the mechanism/technique used to accomplish a "do not touch my code" instrumentation. This is also in accordance with the definition of it in the spec glossary: "Refers to telemetry collection methods that do not require the end-user to modify application's source code. Methods vary by programming language, and examples include code manipulation (during compilation or at runtime), monkey patching, or running eBPF programs."
- A packaging of all the components (instrumentation libraries, exporters, resource detectors, etc.) is a "distribution", and there could also be a "reference/official" distribution by the community. Python is doing that already, and the Collector also has a "core distro" and a "contrib distro". The implication would be that, for example, the "Javaagent" is also a "Distribution of OpenTelemetry Java" in that sense.
We had similar points coming up in the initial discussion as well. @pellared, @austinlparker, @tsloughter, @theletterf, @reyang shared some of their thoughts on "Distribution, Auto Instrumentation, Agent, Instrumentation Library, Instrumentation Extension" as well.
Let me maybe rephrase what the background of this issue is and what my expected outcome is:
Many of the OpenTelemetry end-users come from APM vendors that provide them an "all-in-one-solution" that they can add to their application without changing code and it will send back telemetry to their backend of choice. With that expectation in mind, they come to the OpenTelemetry project and they look for a replacement of that "all-in-one-solution". Depending on the language they start with, that replacement they are presented with is either called "agent", "auto instrumentation", only "instrumentation", "distribution" or they even come to the conclusion that the "SDK" is doing that for them. So, they get confused (and I'm confused as well by that;-)), and that's a big blocker in their journey with OpenTelemetry.
Here's what I'd like to see: an end user with NO experience at all who comes to the OpenTelemetry project looking to get observability for their application is presented with a consistent journey:
Step 1: Here's that "all-in-one-solution", and it will do EVERYTHING for you, just plug into your application without changing code, point it to your backend and you are done.
Step 2: Learn how to mix that "all-in-one-solution" with manual instrumentation to "personalize" your observability.
Step 3: Realize that you don't need that "all-in-one-solution" for your application, you can peel it like an onion and eventually end with only the SDK, exporters, resource detectors, instrumentation libraries, etc., that are relevant to your application.
Step 4: Understand that SDK & API are decoupled, so you can have dependencies that support otel natively and you can build those libraries with native otel support yourself.
Step 5: 😍 or 🤑
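The decoupling of API and SDK mentioned in Step 4 can be illustrated with a toy sketch. Everything in it (`NoOpTracer`, `RecordingTracer`, `set_tracer`, `get_tracer`, `fetch_user`) is a hypothetical stand-in invented for this example, not the OpenTelemetry API. It only shows the idea that library code calls a thin API which does nothing until the application registers an SDK.

```python
class NoOpTracer:
    """Default 'API-only' behaviour: accepts calls, records nothing."""

    def record_span(self, name):
        pass


class RecordingTracer:
    """Stand-in for an SDK implementation that actually keeps telemetry."""

    def __init__(self):
        self.spans = []

    def record_span(self, name):
        self.spans.append(name)


# The "API" holds a global tracer that defaults to the no-op implementation.
_tracer = NoOpTracer()


def set_tracer(tracer):
    """Called by the application (or an all-in-one bundle) to plug in an SDK."""
    global _tracer
    _tracer = tracer


def get_tracer():
    return _tracer


# "Library code" with native support: depends on the API only, never on the SDK.
def fetch_user(user_id):
    get_tracer().record_span("fetch_user")
    return {"id": user_id}


user_without_sdk = fetch_user(1)  # works, but no telemetry is recorded

sdk_tracer = RecordingTracer()
set_tracer(sdk_tracer)
user_with_sdk = fetch_user(2)  # same library code, now recorded by the SDK
```

In this picture the "all-in-one-solution" from Step 1 is whatever performs the `set_tracer` call and the surrounding wiring on the user's behalf.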
Following up on this issue. Throughout the repositories & the documentation, "automatic instrumentation" is used to describe three different things:
1. The mechanism to accomplish "don't touch my code instrumentation", used as a verb/action:
   - "We automatically instrument and support a huge number of libraries, frameworks, and application servers" (4)
   - "It's essential in languages that don't support fully automatic instrumentation and still useful in others." (7)
   - "Many common modules such as the http standard library module, express, and others can be automatically instrumented using autoinstrumentation modules." (9)
   - "Automatic instrumentation utilizes monkey-patching" (10)
2. A synonym for "instrumentation libraries":
   - "The JAR file contains the agent and all automatic instrumentation packages." (1)
   - "Many popular components support automatic instrumentation" (3)
   - "Optionally register automatic instrumentation libraries" (5)
   - "Examples of this would include automatic instrumentation libraries" (6)
   - "Many common modules such as the http standard library module, express, and others can be automatically instrumented using autoinstrumentation modules." (9)
   - "This module provides automatic instrumentation for the fastify module" (11)
3. The capital-letter all-in-one-solution dont-call-it-agent "Automatic Instrumentation":
   - "Automatic instrumentation with Java uses a Java agent JAR" (2)
   - "You can use automatic instrumentation to initialize signal providers and generate telemetry data for supported instrumented libraries without modifying the application’s source code." (8)
   - "Run the first script without the automatic instrumentation agent and the second with the agent." (10)
   - "Automatic instrumentation is enabled by adding instrumentation packages" (11)
   - "OpenTelemetry Go Automatic Instrumentation" (12)
   - "OpenTelemetry auto-instrumentation extension" (13)
   - "The operator can inject and configure OpenTelemetry auto-instrumentation libraries. Currently DotNet, Java, NodeJS and Python are supported." (14)
Those in (2) could be rewritten using the term "instrumentation library", which also does not contradict (1) if a library is instrumented with a certain auto-instrumentation mechanism. There's nothing wrong with saying "an instrumentation library auto-instruments a library for you by using byte code injection, monkey patching, etc".
My concern lies with (3) vs (1) & (2), or in other words the mechanism/action "automatic instrumentation" vs the capital-letter all-in-one-solution dont-call-it-agent "Automatic Instrumentation" for applications. This is not only highly confusing for end-users, but as you can see from the examples above, it's also hard for contributors to write docs & manuals properly.
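For readers unfamiliar with the mechanism in (1), here is a minimal sketch of monkey patching using only the Python standard library. The helpers `instrument` and `uninstrument`, the `RECORDED_SPANS` list and the `handle_request` function are hypothetical names made up for this illustration; real instrumentation libraries are considerably more involved, and this is not how any particular OpenTelemetry package is implemented.

```python
import functools
import json
import time

# Hypothetical in-memory sink standing in for a real exporter.
RECORDED_SPANS = []


def instrument(module, func_name):
    """Replace module.func_name with a wrapper that records a timing entry."""
    original = getattr(module, func_name)

    @functools.wraps(original)
    def wrapper(*args, **kwargs):
        start = time.perf_counter()
        try:
            return original(*args, **kwargs)
        finally:
            RECORDED_SPANS.append({
                "name": f"{module.__name__}.{func_name}",
                "duration_s": time.perf_counter() - start,
            })

    setattr(module, func_name, wrapper)
    return original  # returned so the patch can be undone later


def uninstrument(module, func_name, original):
    """Restore the original function."""
    setattr(module, func_name, original)


# "Application code": calls json.dumps and knows nothing about telemetry.
def handle_request(payload):
    return json.dumps(payload)


original_dumps = instrument(json, "dumps")
result = handle_request({"ok": True})
uninstrument(json, "dumps", original_dumps)
```

Here `handle_request` is left untouched, which corresponds to the mechanism in (1), while the `instrument` helper plays the role of an instrumentation library in the sense of (2). Nothing in the sketch configures exporters or resources, which is the part that (3) adds on top.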
I hope to steer conversation towards a slightly different direction:
following up on this issue. Throughout the repositories & the documentation "automatic instrumentation" is used to describe three different things:
- the mechanism to accomplish "don't touch my code instrumentation", used as a verb/action:
I feel the intention here is not about "touch my code or not", it is more about two concerns:
- As a developer who owns the source code, do I need to understand the API and SDK to write some code, or can I just follow some onboarding steps to get telemetry data?
  - OpenTelemetry Go might provide such a mechanism to transform the code via some automation tools, but it does require code access + rebuild/deployment.
- As an operator who does not own the source code (or does not even have access to the source code), can I follow some onboarding steps and get telemetry data?
The verb "automatically instrument" should work fine here. E.g. "automatically instrument your Go code by doing some automatic code transformation", "automatically instrument your Java application by simply setting some environment variables".
- Synonym with "instrumentation libraries":
I think we should stick with the term "instrumentation libraries" or "instrumented libraries", and avoid using these terms for anything that is not covered by https://github.com/open-telemetry/opentelemetry-specification/blob/main/specification/glossary.md#instrumented-library and https://github.com/open-telemetry/opentelemetry-specification/blob/main/specification/glossary.md#instrumentation-library.
- capital-letter all-in-one-solution dont-call-it-agent "Automatic Instrumentation"
This seems to be a distro concern. Don't we have the same problem with SDK? (e.g. do we try to distinguish "OpenTelemetry C++ SDK" and "OpenTelemetry C++ SDK with a bunch of 1st class exporters"?)
I feel the intention here is not about "touch my code or not", it is more about two concerns:
As a developer who owns the source code, do I need to understand the API and SDK to write some code, or can I just follow some onboarding steps to get telemetry data?
- OpenTelemetry Go might provide such a mechanism to transform the code via some automation tools, but it does require code access + rebuild/deployment.
As an operator who does not own the source code (or does not even have access to the source code), can I follow some onboarding steps and get telemetry data?
The verb "automatically instrument" should work fine here. E.g. "automatically instrument your Go code by doing some automatic code transformation", "automatically instrument your Java application by simply setting some environment variables".
ACK. I might not have expressed it elegantly, but that's what I wanted to say with "don't touch my code". Actually, you could even say that the K8s operator "auto instruments" the workload, which adds one more layer of complexity.
- Synonym with "instrumentation libraries":
I think we should stick with the term "instrumentation libraries" or "instrumented libraries", and avoid using these terms for anything that is not covered by https://github.com/open-telemetry/opentelemetry-specification/blob/main/specification/glossary.md#instrumented-library and https://github.com/open-telemetry/opentelemetry-specification/blob/main/specification/glossary.md#instrumentation-library.
ACK, I'll try to find the instances where this is used differently. It's not a big concern in docs or other written material, but of course it might be an issue for packages, etc.
- capital-letter all-in-one-solution dont-call-it-agent "Automatic Instrumentation"
This seems to be a distro concern. Don't we have the same problem with SDK? (e.g. do we try to distinguish "OpenTelemetry C++ SDK" and "OpenTelemetry C++ SDK with a bunch of 1st class exporters"?)
That's what this whole issue is about (and you might have extended its scope): how certain building blocks are named consistently, so that end-users can expect the same experience. API & SDK are defined within the spec (it even includes some required exporters, e.g. OTLP, Logging, Prometheus, Jaeger, Zipkin), but what about:
- a bundle of instrumentation libraries provided by the otel community
- out-of-spec things included in a language SDK (additional exporters, common non-cloud resource-detectors, ...)
- that "automatic instrumentation thingy" I use as an operator who does not own the source code