dd-trace-py
dd-trace-py copied to clipboard
ModuleWatchdog is installed unconditionally when ddtrace is imported
Summary of problem
ModuleWatchdog is installed unconditionally on import.
https://github.com/DataDog/dd-trace-py/blob/2107e4ca68e6772fa8fd4ce4d2145b5e7ddf1675/ddtrace/init.py#L8
This clutters stack traces with many extra lines and generally makes debugging unrelated things more difficult. It would be nice to have a way to disable that with the environment variable DD_TRACE_ENABLED=false.
It would also be nice if DD_TRACE_ENABLED was a master kill switch for all things that get installed automatically on import. For example, currently it looks like DD_INSTRUMENTATION_TELEMETRY_ENABLED is a separate switch that does not respect DD_TRACE_ENABLED=false.
How can we reproduce your problem?
- Set environment variable
DD_TRACE_ENABLED=false - Import
ddtracein a Python shell (see below) ModuleWatchdogand other things (instrumentation telemetry, etc.) should not be installed automatically on import.ModuleWatchdoginstallation can be checked by looking at the type ofsys.modules:
$ export DD_TRACE_ENABLED=false
$ python
>>> import sys
>>> type(sys.modules)
<class 'dict'>
>>> import ddtrace
>>> type(sys.modules)
<class 'ddtrace.internal.module.ModuleWatchdog'>
# expected <class 'dict'>
Thanks for this, @millerdev. Sounds like a useful bit of functionality to add. We'll add this to our backlog.
@emmettbutler on a related note, it seems the watchdog means every import ends up using importlib, which adds several frames for every import. We recently had an incident where production was failing to startup because an import chain went over the 1000 frames recursionlimit and a large part of that stacktrace was
File "<frozen importlib._bootstrap>", line 1176, in _find_and_load
File "<frozen importlib._bootstrap>", line 1147, in _find_and_load_unlocked
File "<frozen importlib._bootstrap>", line 690, in _load_unlocked
File ".venv/lib/python3.11/site-packages/ddtrace/internal/module.py", line 220, in _exec_module
Also, what functionality is removed if the watchdog isn't running?
Also, what functionality is removed if the watchdog isn't running?
@delfick the ModuleWatchdog is used by the tracer to ensure that the patching for integrations happens when the relevant modules are imported. Without that the tracer would have to force-load and patch the modules on boot. This means:
- potential side-effects in the applications for modules being loaded at the wrong time/order
- increased start-up time
So, currently, disabling the main ModuleWatchdog instance would break many tracer integrations.
@P403n1x87 when I tried to work out what I'd lose from turning it off I looked at https://github.com/search?q=repo%3ADataDog%2Fdd-trace-py%20ModuleWatchdog.register&type=code
It seems turning it off loses me
- iast (which I don't think we use)
- Tracing threading.Lock and asyncio equivalent (I don't think too important to us maybe hopefully)
- A debugger we don't use
What else am I missing?
potential side-effects in the applications for modules being loaded at the wrong time/order
Are you able to expand on this please?
Thanks!
This issue has been automatically closed after a period of inactivity. If it's a feature request, it has been added to the maintainers' internal backlog and will be included in an upcoming round of feature prioritization. Please comment or reopen if you think this issue was closed in error.
Requesting to reopen this issue since it still seems important for dd-trace to respect DD_TRACE_ENABLED=false.
Hello! Thank you all very much for the feedback.
Today DD_TRACE_ENABLED is mainly for preventing traces from being sent to the Datadog Agent (seen in the description: Description: Enables or disables sending traces from the application. ) . It still lets the the tracer load in the instrumentation and as you pointed out, it still loads in things like ModuleWatchdog. I agree with you that we should have an option to not load that when it's not needed.
I discussed this with the team and applied the feature request label to this Github issue, because improving this behavior will require some further decisions/discussions and long term planning.
If you have any specific use case examples to share about the impact on your application, please don't hesitate to open a support ticket with a reference to this Github issue and we'll track it on our side when discussing how to improve this.
Please reconsider the decision to ignore this bug report, and add an environment variable that disables ALL ddtrace functionality. I've noticed recently that my interpreter is taking a long time to shutdown after running tests, and breaking it reveals that ddtrace is the culprit:
Traceback (most recent call last):
File "hq/lib/python3.9/site-packages/ddtrace/internal/telemetry/writer.py", line 420, in _app_dependencies_loaded_event
packages = update_imported_dependencies(self._imported_dependencies, newly_imported_deps)
File "hq/lib/python3.9/site-packages/ddtrace/internal/telemetry/data.py", line 76, in update_imported_dependencies
dists = get_module_distribution_versions(module_name)
File "hq/lib/python3.9/site-packages/ddtrace/internal/packages.py", line 82, in get_module_distribution_versions
importlib_metadata.distribution(module_name).version,
File "/mnt/temp/dimagi/uv-local-share/python/cpython-3.9.20-linux-x86_64-gnu/lib/python3.9/importlib/metadata.py", line 542, in distribution
return Distribution.from_name(distribution_name)
File "/mnt/temp/dimagi/uv-local-share/python/cpython-3.9.20-linux-x86_64-gnu/lib/python3.9/importlib/metadata.py", line 192, in from_name
dist = next(iter(dists), None)
File "hq/lib/python3.9/site-packages/importlib_metadata/__init__.py", line 931, in <genexpr>
path.search(prepared) for path in map(FastPath, paths)
File "hq/lib/python3.9/site-packages/importlib_metadata/__init__.py", line 790, in search
return self.lookup(self.mtime).search(name)
File "hq/lib/python3.9/site-packages/importlib_metadata/__init__.py", line 842, in search
self.infos[prepared.normalized]
File "hq/lib/python3.9/site-packages/importlib_metadata/__init__.py", line 899, in __bool__
def __bool__(self):
KeyboardInterrupt
Why on earth is ddtrace in my local development environment collecting telemetry data? Please make a way that I can stop this madness!
Edit: DD_INSTRUMENTATION_TELEMETRY_ENABLED=false does appear to resolve this particular issue, but I would prefer a single kill switch environment variable for all things related to ddtrace.
@millerdev apologies for any inconvenience caused and many thanks for your feedback. We already have some work in flight to reduce the side effects from simply importing ddtrace. And it looks like you have already figured out how to turn off telemetry in your case. Hopefully we'll be able to provide our users with a better experience with the upcoming releases 🤞