[Database] Enable trace context propagation between application and database via instrumentation libraries
Area(s)
area:db
What's missing?
At work, I've noticed a lack of service correlation between our application and SQL Server, such as passing the current service name, trace ID, and span ID to the database. Implementing this feature would greatly enhance our ability to trace application behaviors.
Describe the solution you'd like
Introduce an option that lets instrumentation libraries propagate context information to databases.
Using SQL Server as an example, we can implement a ContextPropagationLevel flag with three possible values:
- disabled: Disable context propagation (the default value).
- service: Pass the current service.name to the database when executing queries.
- trace: Pass the current service.name, trace ID, and span ID to the database when executing queries.
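To make the three levels concrete, here is a minimal sketch of how an instrumentation might decide which fields to send. The enum, the function name, and the dictionary keys are hypothetical and only mirror the proposal above; they are not an existing OpenTelemetry API.

```python
from enum import Enum


class ContextPropagationLevel(Enum):
    """Hypothetical flag mirroring the three proposed values."""
    DISABLED = "disabled"
    SERVICE = "service"
    TRACE = "trace"


def fields_to_propagate(level, service_name, trace_id, span_id):
    """Return the key/value pairs that would accompany a query at this level."""
    if level is ContextPropagationLevel.DISABLED:
        return {}
    # Both "service" and "trace" include the service name.
    fields = {"service.name": service_name}
    if level is ContextPropagationLevel.TRACE:
        # Only the "trace" level adds the per-request identifiers.
        fields["trace_id"] = trace_id
        fields["span_id"] = span_id
    return fields
```

With this shape, service emits data that is stable across requests, while trace emits data that changes on every call.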
For an example of the trace level implementation in SQL Server, see the following PR: open-telemetry/opentelemetry-dotnet-contrib#2709
If you agree with this proposal, I'm willing to create an initial specification for SQL Server. Thank you!
Linking with https://github.com/open-telemetry/opentelemetry-specification/issues/2279
@trask, do you know what is the status of the donation from https://github.com/google/sqlcommenter?tab=readme-ov-file#sqlcommenter?
I mean mostly the semantic convention/specification. It includes:
- Context propagation
  - W3C Traceparent
  - W3C Tracestate
- Other fields (mandatory, or can they be skipped for context propagation?)
  - controller
  - route
  - db_driver
  - framework
- Should we also consider other text-based context propagators if they are configured in OTel (B3, B3 multi, etc.)?
Should it be treated as a stable convention, experimental, or not accepted?
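For reference, a rough sketch of what a sqlcommenter-style comment carrying the W3C fields could look like. It assumes the commonly described sqlcommenter format (URL-encoded values in single quotes, pairs sorted and comma-separated inside a trailing comment); the function names are hypothetical and sorting is done by key for simplicity.

```python
from urllib.parse import quote


def sqlcommenter_comment(attributes):
    """Serialize attributes as key='value' pairs inside a SQL comment."""
    parts = []
    for key in sorted(attributes):
        # URL-encode everything outside the unreserved set, including quotes.
        encoded_key = quote(str(key), safe="")
        encoded_value = quote(str(attributes[key]), safe="")
        parts.append(f"{encoded_key}='{encoded_value}'")
    return "/*" + ",".join(parts) + "*/" if parts else ""


def append_comment(sql, attributes):
    """Append the comment to a statement, leaving the statement text unchanged."""
    comment = sqlcommenter_comment(attributes)
    return f"{sql} {comment}" if comment else sql
```

Note that the traceparent value differs for every span, so the final statement text differs for every call.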
EDIT:
The alternative approach seems to be sending context_info as a separate statement and decorating all calls to the db with such statements.
> @trask, do you know what is the status of the donation from https://github.com/google/sqlcommenter?tab=readme-ov-file#sqlcommenter?
It hasn't landed in the specification or semantic conventions repo, so I wouldn't consider it a part of OpenTelemetry yet, even though the IP was donated to OpenTelemetry in 2021.
I'm also personally hesitant about recommending propagating trace context via SQL comments due to https://github.com/google/sqlcommenter/issues/284
We encountered the same issue with SQL Server where adding comments to queries changes their unique identifier. To address this, I proposed two context propagation levels, and the following items outline the initial implementation plan:
- For SQL Server, use sqlcommenter in service propagation mode to send service-level information (which rarely changes), and leverage SET CONTEXT_INFO in trace mode to transmit more detailed data (e.g., trace and span information).
- For databases that don't have this issue, we can use sqlcommenter to handle both propagation modes seamlessly.
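A minimal sketch of that split for SQL Server. It assumes the traceparent is sent as ASCII text in a binary literal, and the function name, the level strings, and the service.name comment key are hypothetical; the actual encoding is whatever the instrumentation and the convention agree on.

```python
from urllib.parse import quote


def sql_server_carriers(level, service_name, traceparent):
    """Return (session_statement, query_comment) for the given level."""
    if level == "disabled":
        return None, None
    # Service-level data rarely changes, so it can live in the query comment
    # without producing a new query text on every call.
    comment = f"/*service.name='{quote(service_name, safe='')}'*/"
    if level == "service":
        return None, comment
    if level == "trace":
        # Per-request data goes to the session, not into the query text.
        payload = traceparent.encode("ascii").hex().upper()
        return f"SET CONTEXT_INFO 0x{payload}", comment
    raise ValueError(f"unknown propagation level: {level!r}")
```

The session statement would be executed on the same connection before the instrumented query.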
It seems there are two key topics we need to discuss:
- Introducing flags to control the context propagation level
  - An alternative idea (proposed by @XSAM): Using configuration flags, such as context-propagation.sqlcommenter: enabled and context-propagation.set-context-info: enabled, to enable or disable specific propagation mechanisms instead
- Deciding on the implementation approach, e.g., whether to adopt sqlcommenter or explore other alternatives.
  - Considering the importance of database context propagation for troubleshooting, and given that there doesn't seem to be a better place to store this information (even SQL Server's CONTEXT_INFO has a limit of 128 bytes), sqlcommenter still seems to be the best option.
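To put the 128-byte limit in perspective, here is a small size check. It assumes the payload is the text form of traceparent, optionally followed by tracestate; joining the two with a semicolon is a hypothetical layout chosen only for the arithmetic.

```python
CONTEXT_INFO_LIMIT = 128  # bytes, the CONTEXT_INFO limit mentioned above


def fits_in_context_info(traceparent, tracestate=""):
    """Return (fits, size_in_bytes) for a text payload."""
    payload = traceparent if not tracestate else f"{traceparent};{tracestate}"
    size = len(payload.encode("utf-8"))
    return size <= CONTEXT_INFO_LIMIT, size
```

A version 00 traceparent is 55 characters, so it fits on its own, but that leaves only 72 bytes for a separator and tracestate.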
@Kielek @trask Both of these topics warrant further discussion, and your thoughts and suggestions would be greatly appreciated. cc @open-telemetry/semconv-db-approvers
> Introducing flags to control the context propagation level
I think this is an instrumentation library specific configuration that does not need to be specified.
Notice that https://github.com/open-telemetry/semantic-conventions/blob/main/docs/database/database-spans.md does not say anything about configuration options for instrumentations.
> Deciding on the implementation approach
Once we have a decision, I think it would be about documenting in https://github.com/open-telemetry/semantic-conventions/blob/main/docs/database/sql-server.md how sqlcommenter or CONTEXT_INFO can be used as a propagation carrier.
I created the initial documentation outlining the information needed for propagating context info to databases here: #2236
> Once we have a decision, I think it would be about documenting in https://github.com/open-telemetry/semantic-conventions/blob/main/docs/database/sql-server.md how sqlcommenter or CONTEXT_INFO can be used as a propagation carrier.
@pellared I think the propagation carrier is likely an implementation detail specific to the instrumentation library and does not need to be documented in the semantic conventions. Please let me know if you think this section should be included.
> I think the propagation carrier is likely an implementation detail specific to the instrumentation library and does not need to be documented in the semantic conventions.
I think documenting which means should be used to propagate the trace context is crucial to make sure that all languages use the same mechanism. I mean, if SQL Server had some tracing functionality, then all instrumentation libraries (e.g. in Java, .NET, Node.js) should propagate the context in the same way. You already did this in the PR:
> Instrumentations SHOULD propagate the context information to the SQL queries following sqlcommenter.
PS. I reviewed your PR.
Apart from SET CONTEXT_INFO, Microsoft SQL Server also has sp_set_session_context, which
- allows 8000 bytes of value, rather than only 128, so tracestate may fit too
- takes a key along with the value, so you can use N'traceparent' as the key and prevent this from conflicting with existing uses of SET CONTEXT_INFO or sp_set_session_context in applications
- requires SQL Server 2016 or higher
I don't know whether either of those is saved to the transaction log.
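As a rough illustration of the keyed variant, the sketch below only builds the statements and parameters. It assumes a driver that accepts question-mark placeholders and sends strings as nvarchar (two bytes per ASCII character); the function name and the tracestate key are hypothetical, and nothing here talks to a database.

```python
SESSION_CONTEXT_VALUE_LIMIT = 8000  # bytes per value, as noted above


def session_context_calls(traceparent, tracestate=None):
    """Build (statement, params) pairs for sp_set_session_context."""
    calls = []
    for key, value in (("traceparent", traceparent), ("tracestate", tracestate)):
        if not value:
            continue
        # Assumption: the value travels as nvarchar, so measure it as UTF-16.
        if len(value.encode("utf-16-le")) > SESSION_CONTEXT_VALUE_LIMIT:
            continue  # too large for one value, so skip it rather than truncate
        statement = "EXEC sys.sp_set_session_context @key = ?, @value = ?"
        calls.append((statement, (key, value)))
    return calls
```

Because each value has its own key, this does not overwrite whatever an application already stores in CONTEXT_INFO.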
After you have propagated the context to SQL Server, how do you use it from there: do you have T-SQL triggers that read it, or do you capture it via Extended Events?
> After you have propagated the context to SQL Server, how do you use it from there: do you have T-SQL triggers that read it, or do you capture it via Extended Events?
@sincejune already made a contribution to the SQL Server receiver in the OTel Collector: https://github.com/open-telemetry/opentelemetry-collector-contrib/blob/9a92b994ce1c5e2a7b9814105f5c1ba426d1f252/receiver/sqlserverreceiver/scraper.go#L1049-L1051
The logs it produces from the query samples automatically carry the trace info.
I see, sqlServerQuerySample.tmpl reads the context_info column from sys.dm_exec_requests. I don't know whether keyed context data set with sp_set_session_context can be read from any system table in a similar fashion. It's apparently available from sys.fn_get_audit_file but I suppose that would require more configuration.
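For the reading side, here is a sketch of how a scraper could turn a context_info value back into trace fields. It assumes the application stored the traceparent as ASCII text and that the column may come back padded with trailing zero bytes; this is not the receiver's actual code, and the function name is hypothetical.

```python
import re

# W3C traceparent: version, 16-byte trace ID, 8-byte span ID, flags (all lowercase hex).
_TRACEPARENT = re.compile(r"([0-9a-f]{2})-([0-9a-f]{32})-([0-9a-f]{16})-([0-9a-f]{2})")


def parse_context_info(raw):
    """Return trace fields from a context_info value, or None if it holds none."""
    if not raw:
        return None
    text = raw.rstrip(b"\x00").decode("ascii", errors="replace")
    match = _TRACEPARENT.fullmatch(text)
    if match is None:
        return None  # the application uses CONTEXT_INFO for something else
    _version, trace_id, span_id, flags = match.groups()
    if trace_id == "0" * 32 or span_id == "0" * 16:
        return None  # all-zero IDs are invalid in W3C Trace Context
    return {
        "trace_id": trace_id,
        "span_id": span_id,
        "sampled": bool(int(flags, 16) & 0x01),
    }
```

Returning None for anything that does not parse keeps the scraper safe for applications that already use CONTEXT_INFO for other data.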
I have some applications that could benefit from trace context propagation to SQL Server, but they are currently using CONTEXT_INFO for a different purpose. The data they currently store in CONTEXT_INFO is not useful for any scraper processes to read, so I think I should just move that data over to SESSION_CONTEXT and free up CONTEXT_INFO for trace context propagation.
Hi @XSAM @sincejune et al. - trying to triage this and see the status. Given that the PR https://github.com/open-telemetry/semantic-conventions/pull/2495 is merged, what is left here? Can you please give an update so we know how to properly triage this? Thank you.
@joaopgrassi thanks for the heads up. I think the main goal of this issue has been resolved now that https://github.com/open-telemetry/semantic-conventions/pull/2495 is merged.
We might open follow-up issues for certain databases, but there is no active work on this right now. Feel free to close it so it won't bother you during triage.
@XSAM thank you. Closing then.