
New component: AWS ApplicationSignals Processor

Open mxiamxia opened this issue 1 year ago • 2 comments
The purpose and use-cases of the new component

Amazon CloudWatch ApplicationSignals uses the OTel auto-instrumentation SDKs to automatically instrument applications running on AWS. It generates custom application metrics, traces, and logs to monitor application health and track long-term application performance. Currently, the generated telemetry data is processed by the CloudWatch Agent before being sent to the AWS backend.

This proposal is to contribute the ApplicationSignals components in the CloudWatch Agent to the OpenTelemetry Collector community.

The main functionalities

  1. High Cardinality Metrics Protection, which helps users cap the total number of unique metrics for their services before sending them to the destination.
  2. AWS Platform related telemetry attributes enrichment

Example configuration for the component

awsappsignals:
        limiter:
            disabled: false
            drop_threshold: 5000
            log_dropped_metrics: true
            rotation_interval: 10m0s
        resolvers:
            - name: app-signals
              platform: eks
        rules: []

Telemetry data types supported

  1. Traces
  2. Metrics
  3. Logs

Is this a vendor-specific component?

  • [X] This is a vendor-specific component
  • [X] If this is a vendor-specific component, I am proposing to contribute and support it as a representative of the vendor.

Code Owner(s)

mxiamxia@; bjrara@

Sponsor (optional)

Additional context

No response

mxiamxia avatar May 01 '24 20:05 mxiamxia

Added this component to the Collector SIG agenda as a vendor-proposed component. The next maintainer on the rotating sponsor list should pick this up. cc @crobert-1

codeboten avatar May 08 '24 14:05 codeboten

Thank you! Missed the collector SIG discussions for this week. Will join the discussions next week and provide more details about the proposal.

mxiamxia avatar May 09 '24 17:05 mxiamxia

High Cardinality Metrics Protection, which helps users cap the total number of unique metrics for their services before sending them to the destination.

Could we consider achieving this in a non-vendor-specific way?

jeromeinsf avatar May 10 '24 18:05 jeromeinsf

AWS Platform related telemetry attributes enrichment

Can we detail this?

jeromeinsf avatar May 10 '24 18:05 jeromeinsf

Will keep design updated in https://docs.google.com/document/d/1YybdUN2__QL7mSF86ZBCN7AL4W5eXDgyoVzEGCpwz58/edit?usp=sharing

mxiamxia avatar May 15 '24 00:05 mxiamxia

Thx @mxiamxia I think it would be beneficial to detail what is AWS specific in the 4 components of this processor, and why a combination of the existing processors cannot achieve the same results

jeromeinsf avatar May 15 '24 14:05 jeromeinsf

@bryan-aguilar We had discussed this in a couple collector sig meetings, are you able to sponsor this?

crobert-1 avatar Jun 07 '24 18:06 crobert-1

High Cardinality Metrics Protection, which helps users cap the total number of unique metrics for their services before sending them to the destination.

Could we consider achieving this in a non-vendor-specific way?

Hi @jeromeinsf, sorry for the late response. The MetricLimiter that comes with this component has two primary functions: 1) group the metrics on a list of specific metric attributes, then count and sort the occurrences of each metric group using a Count-Min Sketch (CMS); 2) act on the metrics with fewer occurrences in the CMS once the cardinality threshold is reached. Currently, these two functions are implemented in a very specific way, based on the customer experience designed for AppSignals. With extra effort, I think it is possible to abstract some pieces for general-purpose use. For the current proposal, we would prefer to defer this effort; it requires a follow-up discussion with the community about the abstractions.
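
For readers unfamiliar with the technique, the two MetricLimiter functions described above can be illustrated with a minimal sketch. This is not the component's implementation: the class names, the `group_by` and `drop_threshold` parameters, the sketch dimensions, and the choice to drop (rather than, say, aggregate) low-frequency groups are all assumptions made for illustration.

```python
import hashlib
from typing import Dict, Iterable, Iterator, Set, Tuple


class CountMinSketch:
    """Fixed-size frequency estimator; it may overcount but never undercounts."""

    def __init__(self, depth: int = 4, width: int = 1024) -> None:
        self.depth = depth
        self.width = width
        self.rows = [[0] * width for _ in range(depth)]

    def _cells(self, key: str) -> Iterator[Tuple[int, int]]:
        # One independent hash per row, derived by salting with the row index.
        for row in range(self.depth):
            digest = hashlib.blake2b(
                key.encode("utf-8"), digest_size=8, salt=row.to_bytes(2, "big")
            ).digest()
            yield row, int.from_bytes(digest, "big") % self.width

    def add(self, key: str, count: int = 1) -> None:
        for row, col in self._cells(key):
            self.rows[row][col] += count

    def estimate(self, key: str) -> int:
        # The smallest counter is the one least inflated by hash collisions.
        return min(self.rows[row][col] for row, col in self._cells(key))


class MetricLimiter:
    """Hypothetical limiter: admits at most `drop_threshold` metric groups."""

    def __init__(self, group_by: Iterable[str], drop_threshold: int) -> None:
        self.group_by = tuple(group_by)
        self.drop_threshold = drop_threshold
        self.sketch = CountMinSketch()
        self.admitted: Set[str] = set()

    def group_key(self, attributes: Dict[str, str]) -> str:
        # Function 1: group on the selected attributes only.
        return "|".join(f"{name}={attributes.get(name, '')}" for name in self.group_by)

    def admit(self, attributes: Dict[str, str]) -> bool:
        key = self.group_key(attributes)
        self.sketch.add(key)
        if key in self.admitted:
            return True
        if len(self.admitted) < self.drop_threshold:
            self.admitted.add(key)
            return True
        # Function 2: at the threshold, a new group is admitted only if it is
        # estimated to occur more often than the least frequent admitted group.
        coldest = min(self.admitted, key=self.sketch.estimate)
        if self.sketch.estimate(key) > self.sketch.estimate(coldest):
            self.admitted.remove(coldest)
            self.admitted.add(key)
            return True
        return False
```

A caller would invoke `admit(attributes)` once per data point and drop the point when it returns `False`. The sketch keeps memory bounded regardless of how many distinct groups arrive, which is why it suits cardinality protection; the exact set of admitted groups is the only state that grows, and it is capped by the threshold.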

mxiamxia avatar Jun 11 '24 22:06 mxiamxia

Thx @mxiamxia I think it would be beneficial to detail what is AWS specific in the 4 components of this processor, and why a combination of the existing processors cannot achieve the same results

Regarding the 4 components listed: AppSignals users can use a list of existing processors, including attributesprocessor, spanprocessor, and k8sprocessor, and configure them in a certain way to process AppSignals data and fulfill part of the requirements for ApplicationSignals. Some new pieces, such as the EKS/K8s Pod IP resolver and the MetricLimiter, would still need to be introduced by this new component. So we built one inclusive, vendor-specific component that meets all these requirements, so users won't need to worry about any of the configuration.

mxiamxia avatar Jun 11 '24 23:06 mxiamxia

Hey @mxiamxia,

Do you mind organising a sponsor for this component so that subsequent PRs can flow through?

MovieStoreGuy avatar Jul 01 '24 11:07 MovieStoreGuy

We will need to find a new sponsor. @bryan-aguilar from AWS is no longer available to sponsor this. We are fine with a sponsor who is not from AWS. We will ensure that PRs for this new processor are still reviewed and owned by committed AWS representatives.

jj22ee avatar Jul 01 '24 20:07 jj22ee

Hi @crobert-1, would you be free to be the sponsor for this component? As mentioned earlier, @bryan-aguilar isn't available to sponsor this.

As this component is vendor-specific for AWS ApplicationSignals, @mxiamxia, @bjrara, and I will help with reviewing/supporting the contributions to this component as representatives.

jj22ee avatar Jul 03 '24 08:07 jj22ee

Hello @jj22ee, I recently sponsored a different vendor-specific component, so I've moved to the back of the rotating sponsors list.

I see a few different contributors here proposing to work on this, are any of you OpenTelemetry project members? I ask because if the contributor is a member we'll automatically assign a sponsor. Otherwise, you may need to find a sponsor, or become an OpenTelemetry member. Here are the guidelines for contributing new vendor-specific components.

crobert-1 avatar Jul 03 '24 15:07 crobert-1

@mxiamxia is a member of open-telemetry (checked via https://github.com/orgs/open-telemetry/people). In this case, it sounds like we'll need an automatically assigned sponsor.

jj22ee avatar Jul 03 '24 16:07 jj22ee

As discussed in a Collector SIG meeting, @djaglowski could be the sponsor for this component. Much appreciated!!

jj22ee avatar Jul 03 '24 17:07 jj22ee

Why isn't this being added to the ADOT collector?

djaglowski avatar Jul 03 '24 18:07 djaglowski

This component is not for ADOT only. The reason to add this component upstream is so that existing users of any OTel collector (non-ADOT) can use ApplicationSignals with their current OTel setup.

jj22ee avatar Jul 05 '24 17:07 jj22ee

This component is not for ADOT only. The reason to add this component upstream is so that existing users of any OTel collector (non-ADOT) can use ApplicationSignals with their current OTel setup.

Generally, hosting a component upstream isn't necessary in order to allow others to pull it into any other OTel collector. As long as the component's Go module is publicly accessible, it should be possible. If this is the only reason, then is it worth having the community take this on as an obligation?

The MetricLimiter that comes with this component has two primary functions: 1) group the metrics on a list of specific metric attributes, then count and sort the occurrences of each metric group using a Count-Min Sketch (CMS); 2) act on the metrics with fewer occurrences in the CMS once the cardinality threshold is reached. Currently, these two functions are implemented in a very specific way, based on the customer experience designed for AppSignals. With extra effort, I think it is possible to abstract some pieces for general-purpose use. For the current proposal, we would prefer to defer this effort; it requires a follow-up discussion with the community about the abstractions. ... Regarding the 4 components listed: AppSignals users can use a list of existing processors, including attributesprocessor, spanprocessor, and k8sprocessor, and configure them in a certain way to process AppSignals data and fulfill part of the requirements for ApplicationSignals. Some new pieces, such as the EKS/K8s Pod IP resolver and the MetricLimiter, would still need to be introduced by this new component. So we built one inclusive, vendor-specific component that meets all these requirements, so users won't need to worry about any of the configuration.

I'm not sold on the idea that there's anything vendor-specific here other than configuration settings. Is it fair to say that this proposal could be separated into two parts?

  1. Opinionated configuration for three existing components.
  2. A new MetricLimiter processor, which could be generic, but would be easier to implement with the same opinionated assumptions used in a reference implementation.

djaglowski avatar Jul 08 '24 15:07 djaglowski

This component is not for ADOT only. The reason to add this component upstream is so that existing users of any OTel collector (non-ADOT) can use ApplicationSignals with their current OTel setup.

Generally, hosting a component upstream isn't necessary in order to allow others to pull it into any other OTel collector. As long as the component's Go module is publicly accessible, it should be possible. If this is the only reason, then is it worth having the community take this on as an obligation?

Thanks. Another reason is that we want these components to be more discoverable for OTel users. IMHO, the OTel contrib repo was designed for this purpose: allowing vendors to contribute their components. AWS has most of its OTel components in the contrib repo, so we would like to include this one there as well. :)

mxiamxia avatar Jul 25 '24 16:07 mxiamxia

I'm not sold on the idea that there's anything vendor-specific here other than configuration settings. Is it fair to say that this proposal could be separated into two parts? Opinionated configuration for three existing components.

We can't simply replace this component with opinionated configuration of the existing components, because it involves very vendor-specific implementations. For example, we mutate telemetry attributes based on the detected AWS platform where the applications are running. We also plan to implement centralized telemetry data filter/replace/drop rules that can be retrieved from AWS, eliminating the need for customers to update their local config. These are some of the reasons why we want to introduce this component.

Going back to your previous comment: for vendors introducing very business-specific processors, is it common to contribute them to the contrib repo, or should vendors just host them in their own publicly accessible repos?

A new MetricLimiter processor, which could be generic, but would be easier to implement with the same opinionated assumptions used in a reference implementation.

Yes, we can make the metric limiter part generic for all OTel users. Initially, we were thinking of contributing it as is and then collaborating with the community to optimize it for general use. However, I am fine with holding off on the MetricLimiter part for now and coming up with a more general design later.

mxiamxia avatar Jul 25 '24 16:07 mxiamxia

It sounds like there may be a real need for such a component but I'm not going to take it on as an obligation. The concept of automatically accepting vendor-related components was never intended to apply to all vendor-specific use cases. It was intended to ensure no vendor is excluded at a basic level. We recently updated our guidelines to reflect this intention. Generally speaking, components should be accepted based on the capacity and judgement of the maintainers & approvers.

djaglowski avatar Jul 26 '24 16:07 djaglowski

Hi Daniel, we are committed to maintaining this component and addressing any issues that may come up. We will also follow the guidelines and ensure all listed criteria are met. We would appreciate your help reviewing our ongoing and upcoming PRs.

mxiamxia avatar Jul 27 '24 01:07 mxiamxia

This issue has been inactive for 60 days. It will be closed in 60 days if there is no activity. To ping code owners by adding a component label, see Adding Labels via Comments, or if you are unsure of which component this issue relates to, please ping @open-telemetry/collector-contrib-triagers. If this issue is still relevant, please ping the code owners or leave a comment explaining why it is still relevant. Otherwise, please close it.

github-actions[bot] avatar Sep 25 '24 03:09 github-actions[bot]