opentelemetry-python-contrib
allow a user-defined function for providing span attributes on engine/connect span begin
Description
Add attrs_provider as an optional kwarg for SQLAlchemyInstrumentor().instrument. If attrs_provider (a zero-argument callable) is provided, it is evaluated before each span is started within the engine or connection instrumentation, and the attributes it returns are supplied at span start. This allows users who have an attribute-aware sampler to pass attributes to the sampler and so control sampling behavior.
Specifically, for my use case, I have a subclass of ParentBasedTraceIdRatio called OverrideableParentBasedTraceIdRatio. It accepts an 'x-ignore-sample' attribute, which forces the sampler to emit a DROP decision when the parent is not already sampled. I would like to force all SQLAlchemy spans to be dropped unless the parent is sampled, which necessitates this change.
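To illustrate the mechanism described above, here is a minimal standalone sketch of how a zero-argument provider could be merged into the attributes passed at span start. It is not the instrumentation code from this PR: `start_span_attributes` and `ignore_sample_attrs` are hypothetical names, and letting provider values override the base attributes is an assumption.

```python
from typing import Callable, Dict, Optional

Attributes = Dict[str, object]


def start_span_attributes(
    base: Attributes,
    attrs_provider: Optional[Callable[[], Attributes]] = None,
) -> Attributes:
    """Return the attributes to hand to the tracer (and so to the sampler)."""
    attrs = dict(base)
    if attrs_provider is not None:
        # Called on every span start, so it can read per-request state.
        # Assumption: provider values win over the base attributes.
        attrs.update(attrs_provider() or {})
    return attrs


def ignore_sample_attrs() -> Attributes:
    """Hypothetical provider that marks spans as droppable by the sampler."""
    return {"x-ignore-sample": True}


merged = start_span_attributes({"db.system": "postgresql"}, ignore_sample_attrs)
print(merged)
```

With the kwarg proposed here, the same provider would be passed as `SQLAlchemyInstrumentor().instrument(attrs_provider=ignore_sample_attrs)`.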
Fixes https://github.com/open-telemetry/opentelemetry-python-contrib/issues/2788
Type of change
- [x] New feature (non-breaking change which adds functionality)
How Has This Been Tested?
- [x] New test in instrumentation/opentelemetry-instrumentation-sqlalchemy/tests/test_sqlalchemy.py
Does This PR Require a Core Repo Change?
- [x] No.
Checklist:
See contributing.md for styleguide, changelog guidelines, and more.
- [x] Followed the style guidelines of this project
- [x] Changelogs have been updated
- [x] Unit tests have been added
- [x] Documentation has been updated
The committers listed above are authorized under a signed CLA.
- :white_check_mark: login: jochs / name: Jess O (63a0d9aaccb596f873cafeee2e6f11072a2bb1dc, fd63aadc511a4e5dd4959d35f35b1fd1c2cf02f0, fcabdd7c4ee93dfd935e823b4a76470b9c7e1c42, 9c1e33dc8f598f2a5248ec147d2e001abb78352f, a96d919c135cb76a19e8f43a10698f51ffc669d0, 27e634e46cb2c0f3ed14238ba31d9f0a76421530)
- :white_check_mark: login: emdneto / name: Emídio Neto (cd2b2e96fffa7d8143e544edbb9470ec31955fc2)
hey @shalevr - do you think you'd be able to review this sometime soon? Happy to create an issue if you need additional context
> Specifically, for my usecase, I have a subclass of the ParentBasedTraceIdRatio called OverrideableParentBasedTraceIdRatio which allows passing of an 'x-ignore-sample' attribute which forces the sampler to emit a DROP decision when the parent is not already sampled. I would like to force all SqlAlchemy spans to be dropped unless the parent is sampled, which necessitates this change.
I'm a bit confused with the behavior of this custom sampler. Would you be able to provide a specific example? Can't you achieve this behavior with a custom ParentBased sampler?
hey @lzchen sorry I missed this. Yes, I am getting the behavior with a custom ParentBased sampler, but the sample decision depends on the attributes, and I currently can't provide attributes to the OTEL-sqlalchemy spans.
The sampler code looks like this:
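from typing import Optional, Sequence

from opentelemetry.context import Context
from opentelemetry.sdk.trace.sampling import (
    Decision,
    ParentBasedTraceIdRatio,
    SamplingResult,
    _get_parent_trace_state,
)
from opentelemetry.trace import Link, SpanKind, TraceState, get_current_span
from opentelemetry.util.types import Attributes
from typing_extensions import override

# X_SAMPLED and X_IGNORE_SAMPLE are attribute-key constants defined elsewhere.


class OverrideableParentBasedTraceIdRatio(ParentBasedTraceIdRatio):
    @override
    def should_sample(
        self,
        parent_context: Optional[Context],
        trace_id: int,
        name: str,
        kind: Optional[SpanKind] = None,
        attributes: Optional[Attributes] = None,
        links: Optional[Sequence[Link]] = None,
        trace_state: Optional[TraceState] = None,
    ) -> SamplingResult:
        if attributes is not None:
            # Force sampling when the caller explicitly asks for it.
            if attributes.get(X_SAMPLED):
                return SamplingResult(
                    decision=Decision.RECORD_AND_SAMPLE,
                    attributes=attributes,
                    trace_state=_get_parent_trace_state(parent_context),
                )
            # Force a drop when the parent is not sampled and the caller opts out.
            parent_span_context = get_current_span(parent_context).get_span_context()
            if not parent_span_context.trace_flags.sampled and attributes.get(X_IGNORE_SAMPLE):
                return SamplingResult(
                    decision=Decision.DROP,
                    attributes=attributes,
                    trace_state=_get_parent_trace_state(parent_context),
                )
        # Otherwise fall back to the normal parent-based ratio decision.
        return super().should_sample(parent_context, trace_id, name, kind, attributes, links, trace_state)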
The idea is that I want to sample X% of the time in general, but by passing the relevant attributes I can force record-and-sample or force a drop. This is generally helpful for hot library code where I don't want to pay the overhead of tracing unless a span is already active via app code. Currently I'm somewhat blocked on the SQLAlchemy instrumentation because I don't want to pay that overhead on 5% of all SQLAlchemy calls (5% is the ratio set in production).
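To make the decision order concrete, here is a simplified, dependency-free model of the sampler logic above. It is only a sketch: `decide` is a hypothetical helper, the string results stand in for `Decision` values, `fallback` stands in for the `super().should_sample(...)` call, and the `"x-sampled"` key is an assumed placeholder since the real value of `X_SAMPLED` is not shown.

```python
from typing import Callable, Mapping, Optional

X_SAMPLED = "x-sampled"  # assumed placeholder; the real constant is not shown
X_IGNORE_SAMPLE = "x-ignore-sample"


def decide(
    attributes: Optional[Mapping[str, object]],
    parent_sampled: bool,
    fallback: Callable[[], str],
) -> str:
    """Mirror the override order: force-sample, then force-drop, then fallback."""
    if attributes is not None:
        if attributes.get(X_SAMPLED):
            return "RECORD_AND_SAMPLE"
        if not parent_sampled and attributes.get(X_IGNORE_SAMPLE):
            return "DROP"
    return fallback()


def ratio() -> str:
    """Stand-in for the parent-based ratio decision."""
    return "PARENT_BASED_RATIO"


# A SQLAlchemy span with no sampled parent is dropped outright.
print(decide({X_IGNORE_SAMPLE: True}, parent_sampled=False, fallback=ratio))
# The same span under a sampled parent defers to the parent-based logic.
print(decide({X_IGNORE_SAMPLE: True}, parent_sampled=True, fallback=ratio))
```

In this model the ignore flag only matters when the parent is not sampled, which is why the SQLAlchemy spans need to carry the attribute at span start.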