opentelemetry-collector-contrib icon indicating copy to clipboard operation
opentelemetry-collector-contrib copied to clipboard

add redact for span name

Open ruimhu1201 opened this issue 1 year ago • 3 comments

Description: Added redact for span name in redactprocessor redaction processor is redact attribute but the span name is not redact. When we depends on this function to do credential redaction, the span name could still be the issue, therefore, add redaction on span name,

Testing: Test case is added to the unit test.

ruimhu1201 avatar May 09 '24 11:05 ruimhu1201

CLA Not Signed

Thanks for the contribution @ruimhu1201! Please make sure to sign the CLA and add a changelog for this change.

Can you give more details regarding the use case? I'm not sure I understand why a span name would include sensitive information

Thank you for reviewing, @codeboten. Here is an example: In the HTTP URL, some customers may utilize a JWT token as a username or group name, which forms part of the path. Although we have implemented a redaction pattern for all keys in the configuration file, and attributes have been successfully redacted, the name still inadvertently reveals the token. image

ruimhu1201 avatar May 10 '24 02:05 ruimhu1201

Although I agree that such a redaction may be useful, note that the token should not end up in the span name (see here and here at http.route), what HTTP library are you using here? This might be an implementation error.

svrnm avatar May 10 '24 08:05 svrnm

As an alternative, for unexpected situations like this, the transformprocessor can be used to modify span names.

The OTTL roadmap states that the redactionprocessor fills a significant role and that the transformprocessor is not the collector's solution for redaction, but I think limiting it's scope to redacting attributes is appropriate.

TylerHelmuth avatar May 10 '24 17:05 TylerHelmuth

Although I agree that such a redaction may be useful, note that the token should not end up in the span name (see here and here at http.route), what HTTP library are you using here? This might be an implementation error.

I totally understood that use token as part of the path is not a good practice, but it's the group and username which is input from customer and we don't have control over.

ruimhu1201 avatar May 11 '24 06:05 ruimhu1201

As an alternative, for unexpected situations like this, the transformprocessor can be used to modify span names.

The OTTL roadmap states that the redactionprocessor fills a significant role and that the transformprocessor is not the collector's solution for redaction, but I think limiting it's scope to redacting attributes is appropriate.

there's no harm for redactprocessor to provide full ability to redact all the pattern even with the span name, do you think it's better to have a flag which can select if custoemr need to redact name or not?

ruimhu1201 avatar May 11 '24 06:05 ruimhu1201

we don't have control over

Can you clarify how you don't have control over how this data ends up on the span name? Customers can always manually name spans whatever they want -- social security numbers, credit card numbers, auth tokens, etc. -- but that's squarely in the misuse category. If we should enable filtering data from misuse, that's a separate discussion IMO.

cartermp avatar May 13 '24 17:05 cartermp

we don't have control over

Can you clarify how you don't have control over how this data ends up on the span name? Customers can always manually name spans whatever they way -- social security numbers, credit card numbers, auth tokens, etc. -- but that's squarely in the misuse category. If we should enable filtering data from misuse, that's a separate discussion IMO.

I understand your concerns and appreciate the opportunity to clarify. As a platform offering real-time services, we empower our clients by allowing them to customize their code integration, which includes specifying URL paths. While we recognize that this flexibility can lead to misuse, it is beyond our purview to dictate customer practices. Nonetheless, we adhere to Microsoft's stringent Azure service guidelines to prevent any inadvertent credential exposure. It is our duty to redact such sensitive information from logs, ensuring compliance with these standards and the protection of our customers' data. We take this responsibility seriously and are committed to upholding the highest security measures.

ruimhu1201 avatar May 16 '24 05:05 ruimhu1201

@ruimhu1201 your PR raises much larger questions around the processor/collector and security, and I'd prefer we approach the topic holistically instead of in an isolated way.

A decision needs to be made (in a new issue, not in this PR), about the future of the redactionprocessor and its purpose in the OpenTelemetry ecosystem. Questions that need answered:

  1. What is the the redactionprocessor responsibility within the OpenTelemetry ecosystem
  2. On what signals should the redaction processor operate
  3. On what fields within the OTLP payload should the redaction operator interact
  4. How configurable should the redactionprocessor be
    • Are there fields that are automatically redacted when the processor is used but others are opt-in?
  5. Should redactions be allow to be conditional

Once decisions are made for those types of questions we need to make implementation decisions like:

  1. Should the above be handled solely by the transformprocessor
  2. Should the redactionprocessor utilize OTTL or implement its own configuration for expressing what to redact

There are probably more decisions to make that will arise once the discussion starts. As a code owner for the redaction processor I do not feel comfortable adding support for other fields until this discussion has happened as it will lead, as we saw in the filterprocessor, to a mismatch of rules/features between signals. I will also admit that I have not been prioritizing this processor, and will struggle to do so while trying to move the Collector to 1.0. We would love to collaborate with you if you and your organization sees this as a priority.

In the meantime the transformprocessor can act as a replacement for the redactionprocessor for non-attribute fields since it allows transforming any field on the payload.

TylerHelmuth avatar May 16 '24 20:05 TylerHelmuth

This PR was marked stale due to lack of activity. It will be closed in 14 days.

github-actions[bot] avatar May 31 '24 05:05 github-actions[bot]

Closed as inactive. Feel free to reopen if this PR is still being worked on.

github-actions[bot] avatar Jun 15 '24 05:06 github-actions[bot]