Usability concerns regarding logging enrichment and tags collision
Hi! I'm currently trying log enrichment on an app but I think there's a problem of usability. In the simple case where I have an AspNet server endpoint which makes a HttpClient request however all the tags collide for example http.request.header, http.response.header. In my logs I don't know if the header comes from the incoming request or outgoing request. I do understand that there are OTEL conventions however they seem to apply only for tracing not logs.
@klauco @xakep139 Any insights to provide here?
Hello @NatMarchand, thank you for reaching out. I agree that some tags share same names in HttpClient and HTTP logging components, but you can easily distinguish them using:
- log category:
- incoming logs will be emitted with
Microsoft.AspNetCore.HttpLogging.HttpLoggingMiddlewarecategory - outgoing logs will have
Microsoft.Extensions.Http.Logging.HttpClientLoggercategory
- incoming logs will be emitted with
- log message:
- incoming logs will start with
Request and Response:message - outgoing logs will have format
{httpMethod} {httpServer}/{HttpPath}, e.g.GET localhost/api/users
- incoming logs will start with
- event id:
- incoming logs have
"EventId": 9 - outgoing logs have
"EventId": 1
- incoming logs have
- other tags (e.g. incoming HTTP logs have
SchemeandProtocoltags emitted by default, outgoing HTTP logs don't emit those)
Additionally, you can also add your custom enricher (one that implements IHttpLogEnricher or IHttpClientLogEnricher) to add other tags that will help you distinguish the logs.
@klauco any other thoughts here?
When adding AddRequestHeadersLogEnricher(), an enricher is registered and then all the logs in the scope of the incoming http request will get the headers of the incoming request as tags, including logs from the outgoing call. I can show you on a small repro.
In the image above, the log is from the category Microsoft.Extensions.Http.Logging.HttpClientLogger (as you said). The
http.response.status_code property is the status code from the outgoing call. The http.request.header.user-agent is the user agent from the incoming request. It's hard to know whether the http.request.* properties are coming from the outgoing call or the incoming request scope.
@NatMarchand, I see where the confusion is coming from.
If log data you shared came from HttpClientLogger - that means that it contains data about an outgoing call only. In the log tags http.request.* tags correspond to the HttpRequestMessage that was sent by an HttpClient. Respectively, all http.response.* tags show the data dedicated to the HttpResponseMessage that was received by the HttpClient.
Does that explain you concerns?
If log data you shared came from HttpClientLogger - that means that it contains data about an outgoing call only.
That's not totally true. I've pinpointed the problem:
builder.Services
.AddRequestHeadersLogEnricher(options =>
{
#pragma warning disable EXTEXP0003
options.HeadersDataClasses["User-Agent"] = DataClassification.None;
#pragma warning restore EXTEXP0003
});
Issue the following log (with otel console exporter):
info: Microsoft.Extensions.Http.Logging.HttpClientLogger[1]
GET bing.com/REDACTED
LogRecord.Timestamp: 2024-04-02T09:59:06.1827531Z
LogRecord.TraceId: 0c5ba511208bc2d193a8f0197a394354
LogRecord.SpanId: eb8c401d9dfc55be
LogRecord.TraceFlags: None
LogRecord.CategoryName: Microsoft.Extensions.Http.Logging.HttpClientLogger
LogRecord.Severity: Info
LogRecord.SeverityText: Information
LogRecord.FormattedMessage: GET bing.com/REDACTED
LogRecord.Body: GET bing.com/REDACTED
LogRecord.Attributes (Key:Value):
http.request.method: GET
server.address: bing.com
url.path: REDACTED
Duration: 487
http.response.status_code: 200
http.request.header.user-agent: Test/1.0
http.request.header.user-agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/123.0.0.0 Safari/537.36 Edg/123.0.0.0
LogRecord.EventId: 1
LogRecord.EventName: OutgoingRequest
LogRecord.ScopeValues (Key:Value):
[Scope.0]:SpanId: eb8c401d9dfc55be
[Scope.0]:TraceId: 0c5ba511208bc2d193a8f0197a394354
[Scope.0]:ParentId: 0000000000000000
[Scope.1]:ConnectionId: 0HN2IVE1H3S44
[Scope.2]:RequestId: 0HN2IVE1H3S44:00000005
[Scope.2]:RequestPath: /bing
Resource associated with LogRecord:
telemetry.sdk.name: opentelemetry
telemetry.sdk.language: dotnet
telemetry.sdk.version: 1.7.0
service.name: unknown_service:ReproTelemetry
See the http.request.header.user-agent added twice, one from my browser calling the server and the other from the httpclient outgoing. When using otlp exporter, only the last "survive"
Repro code
using Microsoft.Extensions.Compliance.Classification;
using Microsoft.Extensions.Compliance.Redaction;
using OpenTelemetry.Logs;
var builder = WebApplication.CreateBuilder(args);
builder.Logging
.AddOpenTelemetry(otelBuilder =>
{
otelBuilder.AddConsoleExporter();
otelBuilder.IncludeFormattedMessage = true;
otelBuilder.IncludeScopes = true;
otelBuilder.ParseStateValues = true;
})
.EnableEnrichment();
builder.Services
.AddRequestHeadersLogEnricher(options =>
{
#pragma warning disable EXTEXP0003
options.HeadersDataClasses["User-Agent"] = DataClassification.None;
#pragma warning restore EXTEXP0003
});
builder.Services
.AddHttpClient("Bing", o =>
{
o.BaseAddress = new("https://bing.com");
o.DefaultRequestHeaders.UserAgent.Add(new("Test", "1.0"));
});
builder.Services
.AddExtendedHttpClientLogging(o =>
{
o.RequestHeadersDataClasses.Add("User-Agent", DataClassification.None);
});
builder.Services.AddSingleton<IRedactorProvider, NullRedactorProvider>();
var app = builder.Build();
app.MapGet("/", () => Results.Ok("hi"));
app.MapGet("/bing", (IHttpClientFactory httpClientFactory) => httpClientFactory.CreateClient("Bing").GetAsync(""));
app.Run();
What is the expected behaviour -- completely omit the incoming HTTP request headers from the outgoing HTTP request logging, or include them with different names?
If I had a magic wand, as I want to have both information, I'd make it on different prefix : http.request.* for incoming request attributes http_client.* or http.outgoing_request.* or something else for the outgoing request
Okay, that happens because you added RequestHeadersLogEnricher - and it will augment all your logs emitted within the processing time of an incoming request, not only HTTP- or HttpClient-specific ones.
Do you really need User-Agent to be added to all these logs (even coming from Microsoft.AspNetCore.Routing.EndpointMiddleware category)?
With regards to the magic wand, you can always create your custom processor that derives from BaseProcessor<LogRecord> and rename/remove any attributes you want.
Alternatively, you can use AddHttpLoggingRedaction() instead of AddRequestHeadersLogEnricher() - that way you'll get one nice log record per each incoming HTTP request without any ambiguity.
Do you really need User-Agent to be added to all these logs (even coming from Microsoft.AspNetCore.Routing.EndpointMiddleware category)?
Yes, it's what we find really interesting with enrichment: we have some headers that allow us to troubleshoot or correlate (we use UserAgent to know which of our service make the call or the cloudflare RayId). That avoid us to make a first call to find the TraceId from them and then filter on this (these) TraceId.
With regards to the magic wand, you can always create your custom processor that derives from BaseProcessor<LogRecord> and rename/remove any attributes you want.
I'm not sure if in the BaseProcessor I'm able to determine if such attribute comes from the enrichment of the request and delete it or from the outgoing call and keep it. In addition, we already have a BaseProcessor we'd like to get rid of which was added to workaround a bug between Compliance package and OTLP batch exporter (see https://github.com/open-telemetry/opentelemetry-dotnet/issues/5276)
Alternatively, you can use AddHttpLoggingRedaction() instead of AddRequestHeadersLogEnricher() - that way you'll get one nice log record per each incoming HTTP request without any ambiguity.
See my first point, what we found really interesting is to filter logs against headers without join.
I would encourage you to rely on TraceId to correlate logs for a particular request.
Regardless, @lmolkova are there any plans in terms of adding HTTP-specific tags in SemConv for logs?
All OTel attributes can be used on traces, metrics, or logs, we don't differentiate attributes by signal. We don't plan to add log-specific conventions and it would be hard to justify or push anyone to implement given all the prior art in the industry.
Here it seems we have an interesting issue - when enriching logs the new attributes clash with existing ones - I guess the enricher should be responsible for the conflict resolution and/or use dedicated namespace to add attributes to.