dd-trace-py
AttributeError when using Sampling Rules Tags as `global_tags` with `ddtrace.opentracer.tracer.Tracer`
When the trace sampling tags shown in the Datadog APM docs are used as global tags, an AttributeError is thrown when creating a span.
Which version of dd-trace-py are you using?
0.47.0
Which version of the libraries are you using?
ddtrace==0.47.0
intervaltree==3.1.0
opentracing==2.4.0
protobuf==3.15.6
six==1.15.0
sortedcontainers==2.3.0
tenacity==7.0.0
wrapt==1.12.1
How can we reproduce your problem?
import ddtrace
from ddtrace.constants import MANUAL_KEEP_KEY
from ddtrace.opentracer import Tracer as DatadogOpenTracer
from ddtrace.opentracer.settings import ConfigKeys

# Placeholder names for the example
service_name = "test-service"
operation_name = "test-operation"

config = {
    # The value is ignored for this tag. See
    # https://github.com/DataDog/dd-trace-py/blob/v0.47.0/ddtrace/span.py#L245
    ConfigKeys.GLOBAL_TAGS: {MANUAL_KEEP_KEY: None},
}
tracer = DatadogOpenTracer(service_name, config=config, dd_tracer=ddtrace.tracer)
tracer.start_active_span(operation_name)
What is the result that you get?
Traceback (most recent call last):
File "test-module/test.py", line 25, in test
File "/.pyenv/versions/test-virtualenv/lib/python3.6/site-packages/ddtrace/opentracer/tracer.py", line 164, in start_active_span
ignore_active_span=ignore_active_span,
File "/.pyenv/versions/test-virtualenv/lib/python3.6/site-packages/ddtrace/opentracer/tracer.py", line 266, in start_span
service=self._service_name,
File "/.pyenv/versions/test-virtualenv/lib/python3.6/site-packages/ddtrace/tracer.py", line 497, in start_span
span.set_tags(self.tags)
File "/.pyenv/versions/test-virtualenv/lib/python3.6/site-packages/ddtrace/span.py", line 290, in set_tags
self.set_tag(k, v)
File "/.pyenv/versions/test-virtualenv/lib/python3.6/site-packages/ddtrace/span.py", line 246, in set_tag
self.context.sampling_priority = priority.USER_KEEP
AttributeError: 'NoneType' object has no attribute 'sampling_priority'
What is the result that you expected?
A span is successfully created with the sampling priority correctly set.
Tentatively, I believe this can be fixed by passing the Datadog context to the span before the ddtrace span and the OpenTracing span (and their contexts) are merged. That can be done by passing context=context when creating a ddtrace.span.Span at https://github.com/DataDog/dd-trace-py/blob/v0.47.0/ddtrace/tracer.py#L433 and https://github.com/DataDog/dd-trace-py/blob/v0.47.0/ddtrace/tracer.py#L451.
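To illustrate why the missing context matters, here is a minimal sketch against ddtrace v0.47.0 that constructs ddtrace.span.Span directly, bypassing the OpenTracing layer. The Span/Context constructor arguments reflect my reading of that version's internals rather than a public API, so treat it as illustrative only:

import ddtrace
from ddtrace.constants import MANUAL_KEEP_KEY
from ddtrace.context import Context
from ddtrace.span import Span

# Without a context, span.context is None, so the MANUAL_KEEP_KEY branch of
# Span.set_tag fails when it assigns context.sampling_priority.
bare_span = Span(ddtrace.tracer, "demo")
try:
    bare_span.set_tag(MANUAL_KEEP_KEY)
except AttributeError as exc:
    print(exc)  # 'NoneType' object has no attribute 'sampling_priority'

# With a context attached at construction time, the same tag sets the
# sampling priority on the context as expected.
ctx_span = Span(ddtrace.tracer, "demo", context=Context())
ctx_span.set_tag(MANUAL_KEEP_KEY)
print(ctx_span.context.sampling_priority)  # 2, i.e. priority.USER_KEEP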
Something additional I found makes me wonder whether I'm prepared to make a PR myself, since it's not clear how the maintainers would prefer to handle this.
I've since tried a new approach, having discovered some special tag handling on the OpenTracing side in ddtrace.opentracer.span.Span.set_tag
import ddtrace
from ddtrace.ext import priority
from ddtrace.opentracer import Tracer as DatadogOpenTracer
from ddtrace.opentracer.tags import Tags
from ddtrace.opentracer.settings import ConfigKeys

# Placeholder names for the example
service_name = "test-service"
operation_name = "test-operation"

config = {
    ConfigKeys.GLOBAL_TAGS: {Tags.SAMPLING_PRIORITY: priority.USER_KEEP},
}
tracer = DatadogOpenTracer(service_name, config=config, dd_tracer=ddtrace.tracer)
tracer.start_active_span(operation_name)
However, it doesn't appear that global tags are ever applied to the OpenTracing span; that is, global tags never come into play at https://github.com/DataDog/dd-trace-py/blob/v0.47.0/ddtrace/opentracer/tracer.py#L280
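For completeness, applying the tag directly to the span returned by start_active_span does go through that special handling, since ddtrace.opentracer.span.Span.set_tag writes Tags.SAMPLING_PRIORITY through to the underlying Datadog context (at least as I read v0.47.0). A minimal sketch with placeholder names:

import ddtrace
from ddtrace.ext import priority
from ddtrace.opentracer import Tracer as DatadogOpenTracer
from ddtrace.opentracer.tags import Tags

tracer = DatadogOpenTracer("test-service", dd_tracer=ddtrace.tracer)
with tracer.start_active_span("test-operation") as scope:
    # Set on the span itself (not via GLOBAL_TAGS), so the OpenTracing-side
    # special handling for Tags.SAMPLING_PRIORITY kicks in.
    scope.span.set_tag(Tags.SAMPLING_PRIORITY, priority.USER_KEEP)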
General workaround:
Use a callback that calls span.set_tag(Tags.SAMPLING_PRIORITY, priority.USER_KEEP) when initializing the OpenTracing integration. Both the Django and Flask libraries under opentracing-contrib/ have this feature (see the sketch below).
Also, perhaps some background on OpenTracing: the reason the tag here is different from my first example is that supporting that tag is an OpenTracing best practice.
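For reference, here is a minimal sketch of the callback workaround using Flask-OpenTracing (opentracing-contrib/python-flask). The start_span_cb hook and the FlaskTracing wiring are how I recall that library working, so please treat this as an illustration rather than a verified recipe:

import ddtrace
from ddtrace.ext import priority
from ddtrace.opentracer import Tracer as DatadogOpenTracer
from ddtrace.opentracer.tags import Tags
from flask import Flask
from flask_opentracing import FlaskTracing

def keep_trace(span):
    # Called by the integration for each request span it starts; force
    # USER_KEEP so the trace is retained regardless of upstream sampling.
    span.set_tag(Tags.SAMPLING_PRIORITY, priority.USER_KEEP)

app = Flask(__name__)
tracer = DatadogOpenTracer("test-service", dd_tracer=ddtrace.tracer)
tracing = FlaskTracing(tracer, True, app, start_span_cb=keep_trace)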
hey @ndwhelan thank you for reaching out!
Do you mind me asking what your use case is for configuring a global tag for sampling priority / manual keep?
@brettlangdon we have a scenario where a sampling rate configured by the clients of a service may be inadequate for the service. The idea is, setting priority manually should give us a partial trace starting at the service even if the caller's sampling logic decided not to sample the trace. Please let us know if there's a better way to approach this problem.
@deadok22 got it, to reiterate to ensure I am understanding correctly.
You have:
Service A -> Service B
Service A is controlling the sampling decision, but you want Service B to send more traces, even if that means you have Service B traces missing the Service A portion in the flamegraph.
Is this accurate?
Yes, that's exactly what we're after
@deadok22 thank you for confirming. Let me check with some people internally, I think we have alternative methods we are suggesting for people, but I need to make sure I have my information correct first.
@deadok22 sorry for the delay getting back to you. Would you be able to expand a bit more on why you want to manually adjust the sampling decision for "Service B" (in our example)?
Is your concern around having more flamegraphs in our UI to inspect, around the impact of sampling on our trace.* metrics, or something else entirely?
Hey Brett. If it's not clear, Sergey and I work together, but in very different time zones. I can at least partially answer this.
It's something else entirely.
We have sampling rules for core shared infrastructure (service A) with sampling rates that make sense for the amount of traffic it sees. Just downstream, we have services that are incredibly sparsely called, and we really want APM data for each request to such a service (service B). Actually, consider myriad services B through Z. We want individual teams and developers to determine different sampling rates for services B through Z based on their needs: 100% for B, 50% for C, 65% for D, and so on, nondeterministically. That desire is unachievable if service A sits in front of services B, C, D, ..., Z.
We opened a support ticket through Datadog's Zendesk support site. I think it's best we continue any conversations on more details or why this is the case there, as it may reveal internal details inappropriate for this medium.
I hope this helps!
Best,
Nick
@ndwhelan and @deadok22 sorry for the long delay, this is something we have been discussing internally. We don't have a clear solution for this right now, but we are still discussing how this could be done either short term (like you are attempting now) or if there is a longer term solution/timeline.
@brettlangdon I just wanted to share that using the workaround with a callback has been working for my team.
@ndwhelan thanks for sharing the update. I am going to keep this issue open since it is a legitimate use case that we don't have a great solution for yet.
hey @ndwhelan and @deadok22, sorry for the long delay getting back to you, we have been discussing this internally to determine the best option.
Today we don't have any alternative solution other than to set MANUAL_KEEP (USER_KEEP) on any downstream services where you wish to override the sampling decision.
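For anyone landing here later, a minimal sketch of what that looks like in a downstream service using the plain ddtrace API (the service and operation names are placeholders):

import ddtrace
from ddtrace.constants import MANUAL_KEEP_KEY

# In the downstream service's request handling: keep this (possibly partial)
# trace regardless of the upstream sampling decision.
with ddtrace.tracer.trace("handle-request", service="service-b") as span:
    span.set_tag(MANUAL_KEEP_KEY)
    ...  # actual request handling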
We have gotten similar use cases from other customers so we intend to brainstorm alternative options and find a better path forward in the future.
Hey @brettlangdon, any updates here?
Manually doing what the GitHub Action was supposed to do (fixed in https://github.com/DataDog/dd-trace-py/pull/7835):
This issue has been automatically closed after six months of inactivity. If it's a feature request, it has been added to the maintainers' internal backlog and will be included in an upcoming round of feature prioritization. Please comment or reopen if you think this issue was closed in error.