dd-trace-py icon indicating copy to clipboard operation
dd-trace-py copied to clipboard

ModuleWatchdog is installed unconditionally when ddtrace is imported

Open millerdev opened this issue 1 year ago • 1 comments

Summary of problem

ModuleWatchdog is installed unconditionally on import.

https://github.com/DataDog/dd-trace-py/blob/2107e4ca68e6772fa8fd4ce4d2145b5e7ddf1675/ddtrace/init.py#L8

This clutters stack traces with many extra lines and generally makes debugging unrelated things more difficult. It would be nice to have a way to disable that with the environment variable DD_TRACE_ENABLED=false.

It would also be nice if DD_TRACE_ENABLED was a master kill switch for all things that get installed automatically on import. For example, currently it looks like DD_INSTRUMENTATION_TELEMETRY_ENABLED is a separate switch that does not respect DD_TRACE_ENABLED=false.

How can we reproduce your problem?

  • Set environment variable DD_TRACE_ENABLED=false
  • Import ddtrace in a Python shell (see below)
  • ModuleWatchdog and other things (instrumentation telemetry, etc.) should not be installed automatically on import. ModuleWatchdog installation can be checked by looking at the type of sys.modules:
$ export DD_TRACE_ENABLED=false
$ python
>>> import sys
>>> type(sys.modules)
<class 'dict'>
>>> import ddtrace
>>> type(sys.modules)
<class 'ddtrace.internal.module.ModuleWatchdog'>
# expected <class 'dict'>

millerdev avatar May 01 '24 14:05 millerdev

Thanks for this, @millerdev. Sounds like a useful bit of functionality to add. We'll add this to our backlog.

emmettbutler avatar May 02 '24 15:05 emmettbutler

@emmettbutler on a related note, it seems the watchdog means every import ends up using importlib, which adds several frames for every import. We recently had an incident where production was failing to startup because an import chain went over the 1000 frames recursionlimit and a large part of that stacktrace was

  File "<frozen importlib._bootstrap>", line 1176, in _find_and_load
  File "<frozen importlib._bootstrap>", line 1147, in _find_and_load_unlocked
  File "<frozen importlib._bootstrap>", line 690, in _load_unlocked
  File ".venv/lib/python3.11/site-packages/ddtrace/internal/module.py", line 220, in _exec_module

Also, what functionality is removed if the watchdog isn't running?

delfick avatar Sep 02 '24 04:09 delfick

Also, what functionality is removed if the watchdog isn't running?

@delfick the ModuleWatchdog is used by the tracer to ensure that the patching for integrations happens when the relevant modules are imported. Without that the tracer would have to force-load and patch the modules on boot. This means:

  • potential side-effects in the applications for modules being loaded at the wrong time/order
  • increased start-up time

So, currently, disabling the main ModuleWatchdog instance would break many tracer integrations.

P403n1x87 avatar Sep 11 '24 14:09 P403n1x87

@P403n1x87 when I tried to work out what I'd lose from turning it off I looked at https://github.com/search?q=repo%3ADataDog%2Fdd-trace-py%20ModuleWatchdog.register&type=code

It seems turning it off loses me

  • iast (which I don't think we use)
  • Tracing threading.Lock and asyncio equivalent (I don't think too important to us maybe hopefully)
  • A debugger we don't use

What else am I missing?

potential side-effects in the applications for modules being loaded at the wrong time/order

Are you able to expand on this please?

Thanks!

delfick avatar Sep 11 '24 21:09 delfick

This issue has been automatically closed after a period of inactivity. If it's a feature request, it has been added to the maintainers' internal backlog and will be included in an upcoming round of feature prioritization. Please comment or reopen if you think this issue was closed in error.

github-actions[bot] avatar Feb 09 '25 00:02 github-actions[bot]

Requesting to reopen this issue since it still seems important for dd-trace to respect DD_TRACE_ENABLED=false.

millerdev avatar Feb 10 '25 14:02 millerdev

Hello! Thank you all very much for the feedback.

Today DD_TRACE_ENABLED is mainly for preventing traces from being sent to the Datadog Agent (seen in the description: Description: Enables or disables sending traces from the application. ) . It still lets the the tracer load in the instrumentation and as you pointed out, it still loads in things like ModuleWatchdog. I agree with you that we should have an option to not load that when it's not needed.

I discussed this with the team and applied the feature request label to this Github issue, because improving this behavior will require some further decisions/discussions and long term planning.

If you have any specific use case examples to share about the impact on your application, please don't hesitate to open a support ticket with a reference to this Github issue and we'll track it on our side when discussing how to improve this.

wantsui avatar Feb 13 '25 14:02 wantsui

Please reconsider the decision to ignore this bug report, and add an environment variable that disables ALL ddtrace functionality. I've noticed recently that my interpreter is taking a long time to shutdown after running tests, and breaking it reveals that ddtrace is the culprit:

Traceback (most recent call last):
  File "hq/lib/python3.9/site-packages/ddtrace/internal/telemetry/writer.py", line 420, in _app_dependencies_loaded_event
    packages = update_imported_dependencies(self._imported_dependencies, newly_imported_deps)
  File "hq/lib/python3.9/site-packages/ddtrace/internal/telemetry/data.py", line 76, in update_imported_dependencies
    dists = get_module_distribution_versions(module_name)
  File "hq/lib/python3.9/site-packages/ddtrace/internal/packages.py", line 82, in get_module_distribution_versions
    importlib_metadata.distribution(module_name).version,
  File "/mnt/temp/dimagi/uv-local-share/python/cpython-3.9.20-linux-x86_64-gnu/lib/python3.9/importlib/metadata.py", line 542, in distribution
    return Distribution.from_name(distribution_name)
  File "/mnt/temp/dimagi/uv-local-share/python/cpython-3.9.20-linux-x86_64-gnu/lib/python3.9/importlib/metadata.py", line 192, in from_name
    dist = next(iter(dists), None)
  File "hq/lib/python3.9/site-packages/importlib_metadata/__init__.py", line 931, in <genexpr>
    path.search(prepared) for path in map(FastPath, paths)
  File "hq/lib/python3.9/site-packages/importlib_metadata/__init__.py", line 790, in search
    return self.lookup(self.mtime).search(name)
  File "hq/lib/python3.9/site-packages/importlib_metadata/__init__.py", line 842, in search
    self.infos[prepared.normalized]
  File "hq/lib/python3.9/site-packages/importlib_metadata/__init__.py", line 899, in __bool__
    def __bool__(self):
KeyboardInterrupt

Why on earth is ddtrace in my local development environment collecting telemetry data? Please make a way that I can stop this madness!

Edit: DD_INSTRUMENTATION_TELEMETRY_ENABLED=false does appear to resolve this particular issue, but I would prefer a single kill switch environment variable for all things related to ddtrace.

millerdev avatar Mar 18 '25 19:03 millerdev

@millerdev apologies for any inconvenience caused and many thanks for your feedback. We already have some work in flight to reduce the side effects from simply importing ddtrace. And it looks like you have already figured out how to turn off telemetry in your case. Hopefully we'll be able to provide our users with a better experience with the upcoming releases 🤞

P403n1x87 avatar Mar 19 '25 11:03 P403n1x87