extensions icon indicating copy to clipboard operation
extensions copied to clipboard

Usability concerns regarding logging enrichment and tags collision

Open NatMarchand opened this issue 1 year ago • 12 comments

Hi! I'm currently trying log enrichment on an app but I think there's a problem of usability. In the simple case where I have an AspNet server endpoint which makes a HttpClient request however all the tags collide for example http.request.header, http.response.header. In my logs I don't know if the header comes from the incoming request or outgoing request. I do understand that there are OTEL conventions however they seem to apply only for tracing not logs.

NatMarchand avatar Feb 15 '24 08:02 NatMarchand

@klauco @xakep139 Any insights to provide here?

geeknoid avatar Mar 07 '24 18:03 geeknoid

Hello @NatMarchand, thank you for reaching out. I agree that some tags share same names in HttpClient and HTTP logging components, but you can easily distinguish them using:

  • log category:
    • incoming logs will be emitted with Microsoft.AspNetCore.HttpLogging.HttpLoggingMiddleware category
    • outgoing logs will have Microsoft.Extensions.Http.Logging.HttpClientLogger category
  • log message:
    • incoming logs will start with Request and Response: message
    • outgoing logs will have format {httpMethod} {httpServer}/{HttpPath}, e.g. GET localhost/api/users
  • event id:
    • incoming logs have "EventId": 9
    • outgoing logs have "EventId": 1
  • other tags (e.g. incoming HTTP logs have Scheme and Protocol tags emitted by default, outgoing HTTP logs don't emit those)

Additionally, you can also add your custom enricher (one that implements IHttpLogEnricher or IHttpClientLogEnricher) to add other tags that will help you distinguish the logs.

@klauco any other thoughts here?

xakep139 avatar Mar 08 '24 10:03 xakep139

When adding AddRequestHeadersLogEnricher(), an enricher is registered and then all the logs in the scope of the incoming http request will get the headers of the incoming request as tags, including logs from the outgoing call. I can show you on a small repro.

image In the image above, the log is from the category Microsoft.Extensions.Http.Logging.HttpClientLogger (as you said). The http.response.status_code property is the status code from the outgoing call. The http.request.header.user-agent is the user agent from the incoming request. It's hard to know whether the http.request.* properties are coming from the outgoing call or the incoming request scope.

NatMarchand avatar Mar 21 '24 15:03 NatMarchand

@NatMarchand, I see where the confusion is coming from. If log data you shared came from HttpClientLogger - that means that it contains data about an outgoing call only. In the log tags http.request.* tags correspond to the HttpRequestMessage that was sent by an HttpClient. Respectively, all http.response.* tags show the data dedicated to the HttpResponseMessage that was received by the HttpClient. Does that explain you concerns?

xakep139 avatar Apr 02 '24 09:04 xakep139

If log data you shared came from HttpClientLogger - that means that it contains data about an outgoing call only.

That's not totally true. I've pinpointed the problem:

builder.Services
    .AddRequestHeadersLogEnricher(options =>
    {
#pragma warning disable EXTEXP0003
        options.HeadersDataClasses["User-Agent"] = DataClassification.None;
#pragma warning restore EXTEXP0003
    });

Issue the following log (with otel console exporter):

info: Microsoft.Extensions.Http.Logging.HttpClientLogger[1]
      GET bing.com/REDACTED
LogRecord.Timestamp:               2024-04-02T09:59:06.1827531Z
LogRecord.TraceId:                 0c5ba511208bc2d193a8f0197a394354
LogRecord.SpanId:                  eb8c401d9dfc55be
LogRecord.TraceFlags:              None
LogRecord.CategoryName:            Microsoft.Extensions.Http.Logging.HttpClientLogger
LogRecord.Severity:                Info
LogRecord.SeverityText:            Information
LogRecord.FormattedMessage:        GET bing.com/REDACTED
LogRecord.Body:                    GET bing.com/REDACTED
LogRecord.Attributes (Key:Value):
    http.request.method: GET
    server.address: bing.com
    url.path: REDACTED
    Duration: 487
    http.response.status_code: 200
    http.request.header.user-agent: Test/1.0
    http.request.header.user-agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/123.0.0.0 Safari/537.36 Edg/123.0.0.0
LogRecord.EventId:                 1
LogRecord.EventName:               OutgoingRequest
LogRecord.ScopeValues (Key:Value):
[Scope.0]:SpanId: eb8c401d9dfc55be
[Scope.0]:TraceId: 0c5ba511208bc2d193a8f0197a394354
[Scope.0]:ParentId: 0000000000000000
[Scope.1]:ConnectionId: 0HN2IVE1H3S44
[Scope.2]:RequestId: 0HN2IVE1H3S44:00000005
[Scope.2]:RequestPath: /bing

Resource associated with LogRecord:
telemetry.sdk.name: opentelemetry
telemetry.sdk.language: dotnet
telemetry.sdk.version: 1.7.0
service.name: unknown_service:ReproTelemetry

See the http.request.header.user-agent added twice, one from my browser calling the server and the other from the httpclient outgoing. When using otlp exporter, only the last "survive"

Repro code
using Microsoft.Extensions.Compliance.Classification;
using Microsoft.Extensions.Compliance.Redaction;
using OpenTelemetry.Logs;

var builder = WebApplication.CreateBuilder(args);
builder.Logging
    .AddOpenTelemetry(otelBuilder =>
    {
        otelBuilder.AddConsoleExporter();
        otelBuilder.IncludeFormattedMessage = true;
        otelBuilder.IncludeScopes = true;
        otelBuilder.ParseStateValues = true;
    })
    .EnableEnrichment();
builder.Services
    .AddRequestHeadersLogEnricher(options =>
    {
#pragma warning disable EXTEXP0003
        options.HeadersDataClasses["User-Agent"] = DataClassification.None;
#pragma warning restore EXTEXP0003
    });
builder.Services
    .AddHttpClient("Bing", o =>
    {
        o.BaseAddress = new("https://bing.com");
        o.DefaultRequestHeaders.UserAgent.Add(new("Test", "1.0"));
    });
builder.Services
    .AddExtendedHttpClientLogging(o =>
    {
        o.RequestHeadersDataClasses.Add("User-Agent", DataClassification.None);
    });
builder.Services.AddSingleton<IRedactorProvider, NullRedactorProvider>();
var app = builder.Build();
app.MapGet("/", () => Results.Ok("hi"));
app.MapGet("/bing", (IHttpClientFactory httpClientFactory) => httpClientFactory.CreateClient("Bing").GetAsync(""));
app.Run();

NatMarchand avatar Apr 02 '24 10:04 NatMarchand

What is the expected behaviour -- completely omit the incoming HTTP request headers from the outgoing HTTP request logging, or include them with different names?

KalleOlaviNiemitalo avatar Apr 02 '24 10:04 KalleOlaviNiemitalo

If I had a magic wand, as I want to have both information, I'd make it on different prefix : http.request.* for incoming request attributes http_client.* or http.outgoing_request.* or something else for the outgoing request

NatMarchand avatar Apr 02 '24 13:04 NatMarchand

Okay, that happens because you added RequestHeadersLogEnricher - and it will augment all your logs emitted within the processing time of an incoming request, not only HTTP- or HttpClient-specific ones. Do you really need User-Agent to be added to all these logs (even coming from Microsoft.AspNetCore.Routing.EndpointMiddleware category)?

With regards to the magic wand, you can always create your custom processor that derives from BaseProcessor<LogRecord> and rename/remove any attributes you want.

xakep139 avatar Apr 02 '24 15:04 xakep139

Alternatively, you can use AddHttpLoggingRedaction() instead of AddRequestHeadersLogEnricher() - that way you'll get one nice log record per each incoming HTTP request without any ambiguity.

xakep139 avatar Apr 02 '24 16:04 xakep139

Do you really need User-Agent to be added to all these logs (even coming from Microsoft.AspNetCore.Routing.EndpointMiddleware category)?

Yes, it's what we find really interesting with enrichment: we have some headers that allow us to troubleshoot or correlate (we use UserAgent to know which of our service make the call or the cloudflare RayId). That avoid us to make a first call to find the TraceId from them and then filter on this (these) TraceId.

With regards to the magic wand, you can always create your custom processor that derives from BaseProcessor<LogRecord> and rename/remove any attributes you want.

I'm not sure if in the BaseProcessor I'm able to determine if such attribute comes from the enrichment of the request and delete it or from the outgoing call and keep it. In addition, we already have a BaseProcessor we'd like to get rid of which was added to workaround a bug between Compliance package and OTLP batch exporter (see https://github.com/open-telemetry/opentelemetry-dotnet/issues/5276)

Alternatively, you can use AddHttpLoggingRedaction() instead of AddRequestHeadersLogEnricher() - that way you'll get one nice log record per each incoming HTTP request without any ambiguity.

See my first point, what we found really interesting is to filter logs against headers without join.

NatMarchand avatar Apr 04 '24 08:04 NatMarchand

I would encourage you to rely on TraceId to correlate logs for a particular request. Regardless, @lmolkova are there any plans in terms of adding HTTP-specific tags in SemConv for logs?

xakep139 avatar Apr 04 '24 13:04 xakep139

All OTel attributes can be used on traces, metrics, or logs, we don't differentiate attributes by signal. We don't plan to add log-specific conventions and it would be hard to justify or push anyone to implement given all the prior art in the industry.

Here it seems we have an interesting issue - when enriching logs the new attributes clash with existing ones - I guess the enricher should be responsible for the conflict resolution and/or use dedicated namespace to add attributes to.

lmolkova avatar Apr 04 '24 18:04 lmolkova