opentelemetry-js How to make metadata accessible

How to make metadata accessible

Open blumamir opened this issue 1 year ago • 3 comments


I am exploring ways to enhance our handling of metadata for instrumentations, aiming to streamline processes and boost efficiency.

Instrumentation (or OpenTelemetry component) metadata comprises static information about OpenTelemetry JS instrumentation (or other components) that is valuable for distributions, control planes, APMs, and similar tools.

We currently record the name and version for each instrumentation, which also serve as the scope name and version for the signals we emit.

Although metadata is not recorded into signals, it can significantly enhance user experience and automate tasks when utilized by distributions, offering a smoother and more intuitive interface.

Metadata Examples

instrumentation description - this text is currently found only in package.json. It provides a concise, user-facing description that includes the instrumented packages and OpenTelemetry context. It was aligned across the codebase to have consistent and meaningful content in #4715 and https://github.com/open-telemetry/opentelemetry-js-contrib/pull/2202. Example text: "OpenTelemetry instrumentation for the amqplib messaging client for RabbitMQ"
Instrumented packages and supported version range - this text is currently only found in the README.md of each instrumentation. https://github.com/open-telemetry/opentelemetry-js-contrib/pull/2196 is an attempt to align it across the codebase. The instrumented package is the user-facing package name, which can differ from the "patched packages" which init() returns. The instrumented package is the most user-friendly name to show in documentation and UIs, thus it is quite useful IMO.
github repository - where the code can be found ("open-telemetry/opentelemetry-js-contrib", "open-telemetry/opentelemetry-js", or third party repos). It is currently found in the package.json for each instrumentation.
github path - the path inside the github repository where the code can be found. For example - plugins/node/instrumentation-amqplib. This info can potentially be extracted from the "homepage" attribute in package.json.
stability status
semantic conventions version implementation
emitted signals

and more info that we might need one day...

Essentially, any information that might be useful for users to consume through various interfaces (documentation, README, UI, links, status) in its raw format.
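
To make the list above concrete, here is a minimal sketch of what a single metadata object could look like in TypeScript. Every type and field name below (ComponentMetadata, instrumentedPackages, githubPath, and so on) is an assumption made for illustration and is not an existing OpenTelemetry JS API. The supported version range and the emitted signals are placeholder values, not the real data for amqplib.

```typescript
// Hypothetical shape for component metadata. None of these names exist in
// OpenTelemetry JS today; they only mirror the examples listed in this issue.
type Signal = 'traces' | 'metrics' | 'logs';

interface InstrumentedPackage {
  /** User-facing package name, which may differ from the patched modules. */
  name: string;
  /** Supported version range, written as a semver range string. */
  supportedVersions: string;
}

interface ComponentMetadata {
  description: string;
  instrumentedPackages: InstrumentedPackage[];
  githubRepository: string;
  githubPath: string;
  // Optional, so fields can be added later without breaking existing
  // implementations that do not set them yet.
  stability?: 'experimental' | 'stable';
  semconvVersion?: string;
  emittedSignals?: Signal[];
}

const amqplibMetadata: ComponentMetadata = {
  description:
    'OpenTelemetry instrumentation for the amqplib messaging client for RabbitMQ',
  // Placeholder range for illustration only, not the real supported range.
  instrumentedPackages: [{ name: 'amqplib', supportedVersions: '>=1.0.0' }],
  githubRepository: 'open-telemetry/opentelemetry-js-contrib',
  githubPath: 'plugins/node/instrumentation-amqplib',
  // Illustrative value.
  emittedSignals: ['traces'],
};
```

Keeping the less settled fields optional is one way to let the interface grow without forcing every instrumentation to be updated at once.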

Usages

Here are a few practical applications of how this metadata can be effectively utilized:

distribution tools, to create automatic READMEs, docs, and any markdown file, where the content is auto-generated based on this data. See auto-instrumentations-node README. The instrumentations list can be auto-generated, and include more info for the user, like the instrumentation description, instrumented package names and supported versions, as well as a link to the homepage. This can enhance the user experience of our contrib distribution users, and can also be leveraged by other third party distributions. Auto-generated text reduces mistakes and maintenance, promotes consistent content, and is less prone to getting out of sync.
OpenTelemetry control planes - If an OpenTelemetry control plane displays information about the components at runtime (via UI, files, or databases), details like the instrumented package can be useful for user-facing interfaces.
Enhancements for UIs - providing enriched information about instrumentation can significantly improve the user experience when interacting with these details
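
As a rough illustration of the first usage, the sketch below renders a markdown list of instrumentations from metadata objects. The InstrumentationEntry type and the renderInstrumentationList function are hypothetical names invented for this example; the real auto-instrumentations-node README may use a different layout.

```typescript
// Hypothetical input type: one entry per instrumentation package.
interface InstrumentationEntry {
  packageName: string; // npm name of the instrumentation package
  description: string;
  instrumentedPackages: { name: string; supportedVersions: string }[];
  homepage: string;
}

// Renders one markdown bullet per instrumentation, sorted by package name
// so the generated output is stable between runs.
function renderInstrumentationList(entries: InstrumentationEntry[]): string {
  return entries
    .slice() // copy, so the caller's array is not reordered
    .sort((a, b) => a.packageName.localeCompare(b.packageName))
    .map((entry) => {
      const supported = entry.instrumentedPackages
        .map((p) => `\`${p.name}\` ${p.supportedVersions}`)
        .join(', ');
      return `- [${entry.packageName}](${entry.homepage}): ${entry.description} (supports ${supported})`;
    })
    .join('\n');
}
```

A generation script could write the returned string between two marker comments in the README, so that the hand-written parts of the file are left untouched.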

Suggestion

I want to suggest aggregating the metadata to achieve the goals above. I can work on the relevant PRs to implement something if there is an agreement. I will start with just the info we already have available, and then introduce a script for the auto-instrumentations-node README auto-generation and enhancement. Additionally, I plan to utilize this data for the odigos distribution of js agent to auto-generate a Node.js section in the Odigos documentation and potentially report back instrumentation statuses to the Odigos control plane based on this data.

Some objectives to consider:

bundle size for web packages
programmatic API to access the data, which does not include parsing markdown, heuristics on naming or exception tables.
auto generate it when possible, see https://github.com/open-telemetry/opentelemetry-js-contrib/pull/2203
making sure we are typescript-friendly for future additions and changes to this interface
nice to have: all the data in a single interface
nice to have: make the information available at runtime from the instrumentation class.

Options

The simplest and most straightforward way would be to add this data to the instrumentation interface, and then have each instrumentation set it up:

as constructor argument, similar to instrumentation name and version which are already passed this way
as a function that instrumentation can override and return a metadata object, like the current init() function for patched packages info.
by defining an optional property from the base class which will expose this data on instrumentation instances.

If we decide to proceed this way, we must address TypeScript compatibility issues across versions to ensure that adding new properties does not introduce complexity.

Consider omitting it from web components for now, so as not to increase bundle size.
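
The three variants above could be sketched as follows. SketchInstrumentationBase is a hypothetical stand-in written for this example; it is not the real InstrumentationBase class, and the metadata property and getMetadata() method are assumptions, not existing API.

```typescript
// Hypothetical minimal metadata type for this sketch.
interface ComponentMetadata {
  description: string;
  instrumentedPackages: string[];
}

// Stand-in base class, NOT the real InstrumentationBase.
abstract class SketchInstrumentationBase {
  // Variant 3: an optional property exposed on instrumentation instances.
  readonly metadata?: ComponentMetadata;

  constructor(
    readonly instrumentationName: string,
    readonly instrumentationVersion: string,
    // Variant 1: passed in like the name and version already are.
    metadata?: ComponentMetadata,
  ) {
    this.metadata = metadata;
  }

  // Variant 2: a function that subclasses can override, similar in spirit
  // to init(). By default it returns whatever the constructor received.
  getMetadata(): ComponentMetadata | undefined {
    return this.metadata;
  }
}

class ViaConstructor extends SketchInstrumentationBase {
  constructor() {
    super('example-instrumentation-a', '0.0.0', {
      description: 'Example set through the constructor',
      instrumentedPackages: ['amqplib'],
    });
  }
}

class ViaOverride extends SketchInstrumentationBase {
  constructor() {
    super('example-instrumentation-b', '0.0.0');
  }

  getMetadata(): ComponentMetadata {
    return {
      description: 'Example set through an override',
      instrumentedPackages: ['amqplib'],
    };
  }
}
```

Because the constructor argument, the property, and the return value are all optional in this sketch, instrumentations that do not provide metadata keep compiling unchanged.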

Alternatively, save this data as a JSON file for each package, and publish it to npm alongside the instrumentations. Tools could then read it from the node_modules folder, and remote users could check out the git tag or make an HTTP request to fetch the data when needed. See the Collector metadata.yaml as an inspiration.
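
For the JSON file option, a consumer would need to parse and validate the published file before using it. The sketch below assumes a hypothetical file layout with description and instrumentedPackages fields; the file name, the field names, and the parsePublishedMetadata function are all invented for illustration. How the text is obtained (reading from node_modules or an HTTP request) is deliberately left out.

```typescript
// Hypothetical layout of a published metadata file.
interface PublishedMetadata {
  description: string;
  instrumentedPackages: { name: string; supportedVersions: string }[];
}

function isRecord(value: unknown): value is Record<string, unknown> {
  return typeof value === 'object' && value !== null;
}

// Parses the text of a metadata file and checks the fields this sketch
// relies on, so that malformed files fail early with a clear error.
function parsePublishedMetadata(jsonText: string): PublishedMetadata {
  const raw: unknown = JSON.parse(jsonText);
  if (!isRecord(raw)) {
    throw new Error('metadata must be a JSON object');
  }
  const description = raw['description'];
  const packages = raw['instrumentedPackages'];
  if (typeof description !== 'string' || !Array.isArray(packages)) {
    throw new Error('metadata is missing required fields');
  }
  const instrumentedPackages = packages.map((entry: unknown) => {
    if (!isRecord(entry)) {
      throw new Error('invalid instrumentedPackages entry');
    }
    const name = entry['name'];
    const supportedVersions = entry['supportedVersions'];
    if (typeof name !== 'string' || typeof supportedVersions !== 'string') {
      throw new Error('invalid instrumentedPackages entry');
    }
    return { name, supportedVersions };
  });
  return { description, instrumentedPackages };
}
```

Unknown extra fields are ignored here, which would let newer metadata files be read by older tools.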

Considerations

many of these fields can be auto-generated and are not a burden to the implementations (github repo, github path, description)
some of the data is already available in the README and can be documented into a json file where it can be consumed easily.
It makes sense to me that if we already record such data, we might want to make sure it can, now or one day, potentially be used for other components like detectors, propagators, processors, samplers, etc.
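
As an example of the auto-generation mentioned in the first point, the sketch below derives the description, github repository, and github path from package.json fields. It assumes the "homepage" value has the shape https://github.com/<owner>/<repo>/tree/<branch>/<path>#readme with a branch name that contains no slash; that shape, and the deriveFromPackageJson name, are assumptions of this example.

```typescript
// Only the package.json fields this sketch reads.
interface PackageJsonLike {
  description?: string;
  homepage?: string;
}

// Hypothetical output type for the derived fields.
interface DerivedMetadata {
  description?: string;
  githubRepository?: string;
  githubPath?: string;
}

// Assumed homepage shape:
//   https://github.com/<owner>/<repo>/tree/<branch>/<path>#readme
// Branch names containing a slash are not handled.
const GITHUB_TREE_URL =
  /^https:\/\/github\.com\/([^/]+\/[^/]+)\/tree\/[^/]+\/([^#?]+)/;

function deriveFromPackageJson(pkg: PackageJsonLike): DerivedMetadata {
  const derived: DerivedMetadata = { description: pkg.description };
  const match = GITHUB_TREE_URL.exec(pkg.homepage ?? '');
  if (match) {
    derived.githubRepository = match[1];
    derived.githubPath = match[2];
  }
  return derived;
}
```

When the homepage does not match the assumed shape, the repository and path are simply left unset, so a build script can flag those packages for manual input.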

I think that once we come up with a good way to record this info, introducing it to existing components is a relatively simple technical task which I am up for doing.

I would appreciate your thoughts, concerns, suggestions or support, to help make this initiative a success!

May 21 '24 18:05 blumamir
