opentelemetry-dotnet
opentelemetry-dotnet copied to clipboard
[sdk] Add OpenTelemetrySdk builder pattern
Relates to #5137 Relates to #5265 Relates to #5648
Changes
- Adds
OpenTelemetrySdkAPI.
Details
#5265 laid the groundwork for cross-cutting extensions (ex #5400). The issue is (currently) only users of OpenTelemetry.Extensions.Hosting will be able to utilize those. What this PR does is add a pattern in SDK for non-hosting users (primarily .NET Framework) to utilize all the cross-cutting things. It also has the benefit of potentially unifying our startup/configuration story/documentation so everything uses the same "With" style (see https://github.com/open-telemetry/opentelemetry-dotnet/pull/5262#discussion_r1468170598).
We have already seen distros working around this gap by rolling their own custom builders for example: https://github.com/elastic/elastic-otel-dotnet/blob/main/src/Elastic.OpenTelemetry/ElasticOpenTelemetryBuilder.cs
Usage
New style:
using var openTelemetry = OpenTelemetrySdk.Create(builder => builder
.ConfigureResource(r => r.AddService("MyService"))
.UseOtlpExporter());
// To access the providers:
// openTelemetry.LoggerProvider
// openTelemetry.MeterProvider
// openTelemetry.TracerProvider
// To get the ILoggerFactory:
// openTelemetry.GetLoggerFactory();
Old style:
var resourceBuilder = ResourceBuilder.CreateDefault().AddService("MyService");
using var tracerProvider = Sdk.CreateTracerProviderBuilder()
.SetResourceBuilder(resourceBuilder)
.AddOtlpExporter()
.Build();
using var meterProvider = Sdk.CreateMeterProviderBuilder()
.SetResourceBuilder(resourceBuilder)
.AddOtlpExporter()
.Build();
using var loggerFactory = LoggerFactory.Create(logging =>
{
logging.AddOpenTelemetry(builder => builder
.SetResourceBuilder(resourceBuilder)
.AddOtlpExporter());
});
// Experimental API
using var loggerProvider = Sdk.CreateLoggerProviderBuilder()
.SetResourceBuilder(resourceBuilder)
.AddOtlpExporter()
.Build();
Migration story enabled for LoggerFactory.Create to LoggerProviderBuilder
#5648 brought LoggerProviderBuilder & WithLogging into the stable public API. Users calling LoggerFactory.Create today don't currently have an API which exposes LoggerProviderBuilder. The API introduced on this PR can be used to migrate users calling LoggerFactory.Create to the WithLogging style:
Before:
using var loggerFactory = LoggerFactory.Create(logging =>
{
logging.AddOpenTelemetry(builder => builder
.AddOtlpExporter()); // Calling extension on OpenTelemetryLoggerOptions
});
After:
using var openTelemetry = OpenTelemetrySdk.Create(builder => builder
.WithLogging(logging => logging.AddOtlpExporter())); // Calling extension on LoggerProviderBuilder
var loggerFactory = openTelemetry.GetLoggerFactory();
Public API changes
namespace OpenTelemetry;
public sealed class OpenTelemetrySdk : IDisposable
{
public LoggerProvider LoggerProvider { get; }
public MeterProvider MeterProvider { get; }
public TracerProvider TracerProvider { get; }
public void Dispose() {}
}
public static class OpenTelemetrySdkExtensions
{
public static ILoggerFactory GetLoggerFactory(this OpenTelemetrySdk sdk) {}
}
Merge requirement checklist
- [X] CONTRIBUTING guidelines followed (license requirements, nullable enabled, static analysis, etc.)
- [X] Unit tests added/updated
- [X] Appropriate
CHANGELOG.mdfiles updated for non-trivial changes - [ ] Changes in public API reviewed (if applicable)
Could you explain how this will be handled
using var openTelemetry = OpenTelemetrySdk.Create(builder => builder
.ConfigureResource(r => r.AddService("MyService"))
.AddOtlpExporter() // (Future thing)
.WithTracing(builder => builder.AddOtlpExporter(otlpOptions => {}));
Could you explain how this will be handled
using var openTelemetry = OpenTelemetrySdk.Create(builder => builder .ConfigureResource(r => r.AddService("MyService")) .AddOtlpExporter() // (Future thing) .WithTracing(builder => builder.AddOtlpExporter(otlpOptions => {}));
Are you asking about the options? Or just two calls to "AddOtlpExporter" in there? I'm going to assume the later.
@alanwest Raised the same question here: https://github.com/open-telemetry/opentelemetry-dotnet/pull/5137#issuecomment-1894452554
This isn't the PR for that discussion, really. We'll work through that when we add the AddOtlpExporter cross-cutting extension.
But my current thinking is: It is a non-issue 🤣
It will be handled IMO the same way this is handled...
using var tracerProvider = Sdk.CreateTracerProviderBuilder()
.AddOtlpExporter()
.AddOtlpExporter()
.Build();
You will get two exporters. There are perfectly valid reasons to do that so I don't think we should assume anything or guess what the user wants. But we can work through it and discuss it more when we get there.
Could you explain how this will be handled
using var openTelemetry = OpenTelemetrySdk.Create(builder => builder .ConfigureResource(r => r.AddService("MyService")) .AddOtlpExporter() // (Future thing) .WithTracing(builder => builder.AddOtlpExporter(otlpOptions => {}));
@CodeBlanch I have a general question regarding the design, could you summarize 1) what you would like to achieve and 2) the design principles? E.g. each signal allows the user to add plugins, how these plugins work with "global" plugins? (basically, what rules do we follow)
Take one concrete example (note that this is just to facilitate the discussion, we won't be able to enumerate all scenarios, a set of rules are needed here):
I want to have OTLP exporter for all 3 signals, however, I want to set the metrics exporting frequency to 10 seconds, and logs/traces to 1 minute. I also want to put my custom redaction processor before the OTLP log exporter.
@CodeBlanch I have a general question regarding the design, could you summarize 1) what you would like to achieve and 2) the design principles?
What I want to achieve is primarily the simplification of the startup and teardown code. One call to make which initializes the SDK and one call to make to dispose/shutdown the SDK. The design principle is "crawl, walk, run" is there a name for that? 🤣 The secondary goal is to simplify docs and examples using this.
Take one concrete example (note that this is just to facilitate the discussion, we won't be able to enumerate all scenarios, a set of rules are needed here):
I want to have OTLP exporter for all 3 signals, however, I want to set the metrics exporting frequency to 10 seconds, and logs/traces to 1 minute. I also want to put my custom redaction processor before the OTLP log exporter.
This PR is NOT adding a cross-cutting AddOtlpExporter method 🚫 I really don't want to wade into that. We'll have to solution that out and it may never manifest. What this PR is doing is allowing users to do this:
// New
using var sdk = OpenTelemetrySdk.Create(builder => builder
.ConfigureResource(r => r.AddService("MyApp"))
.WithLogging(builder => builder.AddOtlpExporter())
.WithMetrics(builder => builder.AddOtlpExporter())
.WithTracing(builder => builder.AddOtlpExporter()));
// Old
var resource = ResourceBuilder.CreateDefault()
.AddService("MyApp");
using var loggingProvider = Sdk.CreateLoggerProviderBuilder()
.SetResourceBuilder(resource)
.AddOtlpExporter()
.Build();
using var meterProdier = Sdk.CreateMeterProviderBuilder()
.SetResourceBuilder(resource)
.AddOtlpExporter()
.Build();
using var tracerProvider = Sdk.CreateTracerProviderBuilder()
.SetResourceBuilder(resource)
.AddOtlpExporter()
.Build();
It is a convenience thing allowing the "manual" bootstrap pattern to utilize the "With" pattern we have in hosting. The main benefit is it allows you to track a single thing to dispose. Seems minor, but most code I see does NOT clean up things correctly 😭
Scenarios
Since you brought up scenarios let's imagine where we could take this.
Crawl: I want to have OTLP exporter for all 3 signals
[My guess is this scenario is what most users want to do.]
With this PR we don't have a cross-cutting AddOtlpExporter so you do this:
using var sdk = OpenTelemetrySdk.Create(builder => builder
.ConfigureResource(r => r.AddService("MyApp"))
.WithLogging(builder => builder.AddOtlpExporter())
.WithMetrics(builder => builder.AddOtlpExporter())
.WithTracing(builder => builder.AddOtlpExporter()));
How would it work once we have a cross-cutting method?
using var sdk = OpenTelemetrySdk.Create(builder => builder
.ConfigureResource(r => r.AddService("MyApp"))
.AddOtlpExporter()
.WithLogging()
.WithMetrics()
.WithTracing());
Walk: I want to have OTLP exporter for all 3 signals sending to different endpoints
With this PR we don't have a cross-cutting AddOtlpExporter so you do this:
using var sdk = OpenTelemetrySdk.Create(builder => builder
.ConfigureResource(r => r.AddService("MyApp"))
.WithLogging(builder => builder.AddOtlpExporter(o => o.Endpoint = new("http://mylogendpoint/logs")))
.WithMetrics(builder => builder.AddOtlpExporter(o => o.Endpoint = new("http://mymetricendpoint/metrics")))
.WithTracing(builder => builder.AddOtlpExporter(o => o.Endpoint = new("http://mytraceendpoint/traces"))));
How would it work once we have a cross-cutting method? Any number of ways. It depends on what options we expose, the API shape, etc. It could look like this:
using var sdk = OpenTelemetrySdk.Create(builder => builder
.ConfigureResource(r => r.AddService("MyApp"))
.AddOtlpExporter(options => {
options.LoggingOptions.Endpoint = new("http://mylogendpoint/logs");
options.MetricsOptions.Endpoint = new("http://mymetricendpoint/metrics");
options.TracingOptions.Endpoint = new("http://mytraceendpoint/trace");
})
.WithLogging()
.WithMetrics()
.WithTracing());
Run: I want to have OTLP exporter for all 3 signals, however, I want to set the metrics exporting frequency to 10 seconds, and logs/traces to 1 minute. I also want to put my custom redaction processor before the OTLP log exporter.
Different ways you could achieve this. If we had a simple AddOtlpExporter cross-cutting extension, you could do this:
using var sdk = OpenTelemetrySdk.Create(builder =>
{
builder.Services.Configure<MetricReaderOptions>(o => o.PeriodicExportingMetricReaderOptions.ExportIntervalMilliseconds = 10_000);
builder.Services.Configure<BatchExportActivityProcessorOptions>(o => o.ScheduledDelayMilliseconds = 60_000);
builder.Services.Configure<LogRecordExportProcessorOptions>(o => o.BatchExportProcessorOptions.ScheduledDelayMilliseconds = 60_000);
builder
.ConfigureResource(r => r.AddService("MyApp"))
.WithLogging(logging => logging.AddProcessor(new CustomLogRedactionProcessor()))
.WithMetrics()
.WithTracing()
.AddOtlpExporter();
});
Registration order
E.g. each signal allows the user to add plugins, how these plugins work with "global" plugins? (basically, what rules do we follow)
Essentially it works the same way we have it today when using the hosting package. Nothing here is really new. The code I have above:
builder
.ConfigureResource(r => r.AddService("MyApp"))
.WithLogging(logging => logging.AddProcessor(new CustomLogRedactionProcessor()))
.WithMetrics()
.WithTracing()
.AddOtlpExporter();
Is just orchestrating these calls:
services.ConfigureOpenTelemetryLoggerProvider(builder => builder.ConfigureResource(r => r.AddService("MyApp")));
services.ConfigureOpenTelemetryMeterProvider(builder => builder.ConfigureResource(r => r.AddService("MyApp")));
services.ConfigureOpenTelemetryTracerProvider(builder => builder.ConfigureResource(r => r.AddService("MyApp")));
services.ConfigureOpenTelemetryLoggerProvider(builder => builder.AddProcessor(new CustomLogRedactionProcessor()));
services.ConfigureOpenTelemetryLoggerProvider(builder => builder.AddOtlpExporter());
services.ConfigureOpenTelemetryMeterProvider(builder => builder.AddOtlpExporter());
services.ConfigureOpenTelemetryTracerProvider(builder => builder.AddOtlpExporter());
The order in which they are called is the order in which they are applied.
Same ordering requirements are actually in play for the APIs we have today...
This user is happy:
using var loggerFactory = LoggerFactory.Create(builder => builder
.AddOpenTelemetry(options => options
.AddProcessor(new CustomLogRedactionProcessor()))
.AddOtlpExporter());
This user is sad:
using var loggerFactory = LoggerFactory.Create(builder => builder
.AddOpenTelemetry(options => options
.AddOtlpExporter()
.AddProcessor(new CustomLogRedactionProcessor())));
This PR is NOT adding a cross-cutting
AddOtlpExportermethod 🚫 I really don't want to wade into that. We'll have to solution that out and it may never manifest.
This is exactly why I think it is unclear, the PR description mentioned:
using var openTelemetry = OpenTelemetrySdk.Create(builder => builder
.ConfigureResource(r => r.AddService("MyService"))
.AddOtlpExporter() // (Future thing)
.AddConsoleExporter() // (Future thing)
.WithTracing()
.WithLogging() // (Experimental API)
.WithMetrics());
// To access the providers:
// openTelemetry.LoggerProvider (Experimental API)
// openTelemetry.MeterProvider
// openTelemetry.TracerProvider
// To get the ILoggerFactory:
// openTelemetry.GetLoggerFactory();
What I want - the overall goal/direction, I cannot review a concrete PR without first understanding what's the overall goal, if the PR is adding a new way of doing things. Specifically, I'm concerned that we end up with a set of incomplete stories "as a user, you can choose path A, B or C, and all of them come with caveats that you should know ahead of time".
This PR is NOT adding a cross-cutting AddOtlpExporter method 🚫 I really don't want to wade into that. We'll have to solution that out and it may never manifest.
This brings me some pause as well. My original impression was that manifesting this was the north star for all this work. If it remains uncertain whether this goal will be achievable, then I agree more thought and design is required before we commit to landing anything.
This may entail reevaluating the work already landed in #5265 - at a bare minimum, it sounds like the new APIs introduced in #5265 should be marked experimental, so that in the event we don't end up shipping any cross-cutting extensions we can delete the experimental APIs.
I think the AddOtlpExporter method needs to remain the steel thread for vetting this work. I agree that the work in this PR may be nice in the end, but I don't think it contributes directly to the steel thread, so it is of lower priority at the moment.
A number of open questions were raised by folks in a recent SIG meeting which I captured in this comment https://github.com/open-telemetry/opentelemetry-dotnet/pull/5137#issuecomment-1894452554. I think circling back to these questions and vetting out possible answers to them should be the primary focus.
Let's merge this and I can do a follow-up which makes it all experimental?
Let's merge this and I can do a follow-up which makes it all experimental?
I disagree.
This PR was marked stale due to lack of activity and will be closed in 7 days. Commenting or Pushing will instruct the bot to automatically remove the label. This bot runs once per day.
This PR was marked stale due to lack of activity and will be closed in 7 days. Commenting or Pushing will instruct the bot to automatically remove the label. This bot runs once per day.
Closed as inactive. Feel free to reopen if this PR is still being worked on.
@reyang @alanwest The usage on the PR description now shows how this API works with UseOtlpExporter (#5400).
Codecov Report
All modified and coverable lines are covered by tests :white_check_mark:
Project coverage is 86.20%. Comparing base (
6250307) to head (2e178c1). Report is 283 commits behind head on main.
Additional details and impacted files
@@ Coverage Diff @@
## main #5325 +/- ##
==========================================
+ Coverage 83.38% 86.20% +2.82%
==========================================
Files 297 256 -41
Lines 12531 11101 -1430
==========================================
- Hits 10449 9570 -879
+ Misses 2082 1531 -551
| Flag | Coverage Δ | |
|---|---|---|
| unittests | ? |
|
| unittests-Project-Experimental | 86.20% <100.00%> (?) |
|
| unittests-Project-Stable | 85.95% <100.00%> (?) |
Flags with carried forward coverage won't be shown. Click here to find out more.
| Files | Coverage Δ | |
|---|---|---|
| src/OpenTelemetry/OpenTelemetrySdk.cs | 100.00% <100.00%> (ø) |
|
| src/OpenTelemetry/OpenTelemetrySdkExtensions.cs | 100.00% <100.00%> (ø) |
This PR was marked stale due to lack of activity and will be closed in 7 days. Commenting or Pushing will instruct the bot to automatically remove the label. This bot runs once per day.
A few customers might do something like the following and expect ASP.NET Core telemetries to be collected. However, since these APIs create their own tracer provider, customers may not see the expected data:
using var openTelemetry = OpenTelemetrySdk.Create(builder => builder
.ConfigureResource(r => r.AddService("MyService"))
.UseOtlpExporter());
using var tracerProvider = Sdk.CreateTracerProviderBuilder()
.AddAspNetCoreInstrumentation()
.Build();
If we proceed with this approach, the simultaneous use of OpenTelemetrySdk.Create() and Sdk.CreateTracerProviderBuilder() should not be allowed. If both methods are used, one should throw an exception to clearly state that these APIs cannot be combined.
This PR was marked stale due to lack of activity and will be closed in 7 days. Commenting or Pushing will instruct the bot to automatically remove the label. This bot runs once per day.
This PR was marked stale due to lack of activity and will be closed in 7 days. Commenting or Pushing will instruct the bot to automatically remove the label. This bot runs once per day.
This PR was marked stale due to lack of activity and will be closed in 7 days. Commenting or Pushing will instruct the bot to automatically remove the label. This bot runs once per day.
This would be a nice addition for distribution authors and simplify documenting proper SDK setup, especially for .NET Framework applications.
This PR was marked stale due to lack of activity and will be closed in 7 days. Commenting or Pushing will instruct the bot to automatically remove the label. This bot runs once per day.
This PR was marked stale due to lack of activity and will be closed in 7 days. Commenting or Pushing will instruct the bot to automatically remove the label. This bot runs once per day.
This PR was marked stale due to lack of activity and will be closed in 7 days. Commenting or Pushing will instruct the bot to automatically remove the label. This bot runs once per day.
Closed as inactive. Feel free to reopen if this PR is still being worked on.
This PR was marked stale due to lack of activity and will be closed in 7 days. Commenting or Pushing will instruct the bot to automatically remove the label. This bot runs once per day.
Closed as inactive. Feel free to reopen if this PR is still being worked on.
This PR was marked stale due to lack of activity and will be closed in 7 days. Commenting or Pushing will instruct the bot to automatically remove the label. This bot runs once per day.