snowflake-connector-python icon indicating copy to clipboard operation
snowflake-connector-python copied to clipboard

SNOW-1763555: OpenTelemetry integration utterly breaks driver if using DataDog

Open lattwood opened this issue 1 year ago • 3 comments

Python version

N/A

Operating system and processor architecture

N/A

Installed packages

ddtrace-py
opentracing-api

What did you do?

I used the driver w/ Datadog APM, which installs the opentelemetry packages that are used as the check to see if we can inject trace context.

With DataDog, this gets happens on every query, and it fails to run.

This is because the only exception caught is ModuleNotFoundError, and opentracing is throwing a ValueError which isn't getting caught.

(please forgive any typos, I copied the backtrace from a screenshot)

ValueError: Propagator tracecontext not found. It is either misspelled or not installed.
  File "/var/task/lambda_function.py", line 116, in lambda_handler
    process_scan(
  File "/var/task/store.py", line 92, in process_scan
    upload_file_to_snowflake(cursor, secrets, file_name=sv_file, table=snowflake_table)
  File "/var/task/store.py", line 55, in upload_file_to_snowflake
    cursor.execute (
  File "/var/task/snowflake/connector/cursor.py", line 984, in execute
    ret = self._execute_helper(query, **kwargs)
  File "/var/task/snowflake/connector/cursor.py", line 695, in _execute_helper
    ret = self._connection.cmd_query(
  File "/var/task/snowflake/connector/connection.py", line 1350, in cd_query
    ret = self. rest. request(
  File "/var/task/snowflake/connector/network.py", line 483, in request
    from opentelemetry.propagate import inject
  File "<frozen importlib._bootstrap»", line 1176, in _find_and_load
  File "<frozen importlib._bootstrap»", line 1147, in _find_and_load_unlocked
  File "<frozen importlib._bootstrap»", Line 690, in _load_unlocked
  File "/var/task/ddtrace/internal/module.py", line 295, in _exec_module
    self.loader.exec_module (module)
  File "/var/task/opentelemetry/propagate/__init__.py", line 149, in ‹module>
    raise ValueError(

What did you expect to see?

I would expect that opentelemetry integration would be done via this- https://github.com/open-telemetry/opentelemetry-python-contrib/tree/main/instrumentation

and not via a hack in the actual library that breaks the driver for anyone that has opentelemetry installed but not in-use.

I hope that this issue results in that change being reverted, and the original requestor of the feature gets this done properly in the open-telemetry project.

Can you set logging to DEBUG and collect the logs?

We already have it at DEBUG and that's the problem.

lattwood avatar Oct 25 '24 13:10 lattwood

This was directly caused by #1989

lattwood avatar Oct 25 '24 13:10 lattwood

Aside- what kind of commit title is test?

image

lattwood avatar Oct 25 '24 13:10 lattwood

User has to import opentelemetry and configure it to have this. This code executes only if the import succeeds which requires user to import and configure opentelemetry.

specifically this is the erroneous assumption, there are many ways the import can fail.

lattwood avatar Oct 25 '24 13:10 lattwood

affected by this as well any estimate to resolve?

schammah avatar Nov 11 '24 14:11 schammah

Seems like a straightforward issue to fix... Can we get some eyes on this one? @sfc-gh-mkeller @sfc-gh-yuwang @sfc-gh-aling @sfc-gh-sghosh @sfc-gh-yixie @sfc-gh-aalam

fbexiga avatar Nov 11 '24 20:11 fbexiga

thanks folks for drawing attention to this, we'll take a look.

sfc-gh-dszmolka avatar Nov 12 '24 07:11 sfc-gh-dszmolka

Thanks for your bug report, this is very valuable input. Unfortunately, after further investigation this proves to be an issue specific to DataDog because by just including a clean opentelemetry-api dependency does not reproduce this problem, which means that what they include is not the "full" opentelemetry-api (see the fact that in opentelemetry-api the tracecontext is available see here and here and here(here you can see the exception being thrown).

This is still a problem that Snowflake can fix by catching all the exceptions there, though that is not the right way of writing code, since you cannot have a catch all exception every time you call any API that should not return an exception, but Snowflake will do it anyway since tracing should not break your data aplication.

sfc-gh-bdrutu avatar Nov 14 '24 01:11 sfc-gh-bdrutu

@sfc-gh-bdrutu Based on what you've said, I've taken the liberty of opening an issue on the ddtrace-py repository- https://github.com/DataDog/dd-trace-py/issues/11407

lattwood avatar Nov 14 '24 14:11 lattwood

I would expect that opentelemetry integration would be done via this- https://github.com/open-telemetry/opentelemetry-python-contrib/tree/main/instrumentation

and not via a hack in the actual library that breaks the driver for anyone that has opentelemetry installed but not in-use.

I hope that this issue results in that change being reverted, and the original requestor of the feature gets this done properly in the open-telemetry project.

Based on my experience with OpenTelemetry in different environments, OpenTelemetry was designed with flexibility in mind, supporting scenarios where telemetry is directly embedded within libraries, rather than requiring separate instrumentation libraries.

bogdandrutu avatar Nov 14 '24 16:11 bogdandrutu

I'm facing this as well. Specifically, incompatible with https://github.com/DataDog/datadog-lambda-python/releases/tag/v6.99.0 Was this introduced by https://github.com/snowflakedb/snowflake-connector-python/commit/0d005199204a90082ec22b63008bb6e44500ec44?

PaulBormanTR avatar Nov 14 '24 16:11 PaulBormanTR

@bogdandrutu you literally have the same bio on your two accounts.

Image Image

lattwood avatar Nov 14 '24 16:11 lattwood

Since I was unable to reproduce the issue I'd really like someone with the issue to test drive the change I proposed in #2106 . Please don't forget to let us know if it worked properly!

sfc-gh-mkeller avatar Nov 14 '24 19:11 sfc-gh-mkeller

merging the PR autoclosed this issue, however it needs a release first, so reopened until release is out. Please folks, if you're able, test the fix (by installing the driver from main) and let us know how it worked for you. Thank you in advance !

sfc-gh-dszmolka avatar Nov 17 '24 18:11 sfc-gh-dszmolka

released in PythonConnector v3.12.4

sfc-gh-dszmolka avatar Dec 04 '24 15:12 sfc-gh-dszmolka

https://github.com/open-telemetry/opentelemetry-python/blob/415c94fb9834ce38459e58d5ee192a98494c4c76/opentelemetry-api/src/opentelemetry/context/init.py#L42-L64

We're getting a ton of log spam due to this.

Failed to load context: contextvars_context, fallback to contextvars_context
Traceback (most recent call last):
  File "/var/task/opentelemetry/context/__init__.py", line 43, in _load_runtime_context
    return next(  # type: ignore
           ^^^^^^^^^^^^^^^^^^^^^
StopIteration

Which it appears is due to datadog setting up a different context provider for if someone enables their opentelemetry integration.

Please implement opentelemetry in a way that it can be disabled with an environment variable, or put the functionality into a separate package.

lattwood avatar Dec 04 '24 20:12 lattwood

Hi @lattwood

You should talk to the DD folks about this problem since our usage is a legit and correct usage of the opentelemetry-api.

Which it appears is due to datadog setting up a different context provider for if someone enables their opentelemetry integration.

Talk to them maybe they shouldn't do that, or they should correctly do it.

Please implement opentelemetry in a way that it can be disabled with an environment variable, or put the functionality into a separate package.

We are not "implementing" opentelemetry, we are only propagate the context to Snowflake and as mentioned our usage is correct and is DD causing your issue.

sfc-gh-bdrutu avatar Dec 05 '24 21:12 sfc-gh-bdrutu