
SNOW-1239684: Snowflake connector telemetry is toggled remotely and always sends internal data to Snowflake

Open dbold opened this issue 1 year ago • 3 comments

Python version

Python 3.11.6 (main, Jan 9 2024, 11:01:12) [GCC 11.4.0]

Operating system and processor architecture

Linux-5.15.0-100-generic-x86_64-with-glibc2.35

Installed packages

aiohttp==3.9.3
aiosignal==1.3.1
asn1crypto==1.5.1
attrs==23.2.0
certifi==2023.11.17
cffi==1.16.0
charset-normalizer==3.3.2
cryptography==42.0.5
filelock==3.13.1
frozenlist==1.4.1
greenlet==3.0.3
idna==3.6
influxdb-client==1.39.0
multidict==6.0.5
numpy==1.26.4
packaging==23.2
pandas==2.2.1
platformdirs==3.11.0
pycparser==2.21
PyJWT==2.8.0
pyOpenSSL==24.0.0
python-dateutil==2.8.2
pytz==2024.1
reactivex==4.0.4
requests==2.31.0
six==1.16.0
snowflake-connector-python==3.7.1
snowflake-sqlalchemy==1.5.1
sortedcontainers==2.4.0
SQLAlchemy==1.4.51
tomlkit==0.12.4
typing_extensions==4.9.0
tzdata==2024.1
urllib3==2.1.0
yarl==1.9.4

What did you do?

It looked from our audits as if the Snowflake Connector starts with telemetry disabled and there is a way to programmatically toggle it, if one so desires.

To our great surprise the Snowflake Connector exfiltrates data and does send telemetry no matter what.

This happens because the telemetry parameter is enabled remotely by the Snowflake server.

This is done early, during the authentication: the session_parameters are updated based on the server response https://github.com/snowflakedb/snowflake-connector-python/blob/main/src/snowflake/connector/auth/_auth.py#L470

Among other entries, data.parameters in the server response contains the telemetry keys:

{'name': 'CLIENT_TELEMETRY_ENABLED', 'value': True},
{'name': 'CLIENT_TELEMETRY_SESSIONLESS_ENABLED', 'value': True},

With the updated session_parameters, Auth calls self._rest._connection._update_parameters(session_parameters), which sets telemetry_enabled = True on the connection.

Furthermore, the connection calls _log_telemetry_imported_packages, which means at least one log event (with all the imported packages) is saved in the buffer even before the connection is fully established.

Interestingly, the list of imported packages is a rather intrusive log to send.

And, at the end, just closing the client will flush telemetry and send the data externally.

Example:

import snowflake.connector

with snowflake.connector.connect(
    user='x',
    password='y',
    account='z',
    warehouse='i',
    database='d',
    validate_default_parameters=True
    ) as c:
    c.telemetry_enabled = False

At the end of this short program we see telemetry got enabled (through the server reply) and that data was sent.

2024-03-15 22:59:19,823 - MainThread connection.py:734 - close() - INFO - closed
2024-03-15 22:59:19,823 - MainThread telemetry.py:211 - close() - DEBUG - Closing telemetry client.
2024-03-15 22:59:19,823 - MainThread telemetry.py:176 - send_batch() - DEBUG - Sending 1 logs to telemetry. Data is {'logs': [{'message': {'driver_type': 'PythonConnector', 'driver_version': '3.7.1', 'source': 'PythonConnector', 'type': 'client_imported_packages', 'value':...
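
The sequence described above can be illustrated with a small, self-contained model. This is only a sketch: `ToyConnection` and its methods are hypothetical stand-ins, not the connector's real classes, and the assumption that the flush on close does not re-check the flag is inferred from the logs above rather than verified against the connector source.

```python
from dataclasses import dataclass, field


@dataclass
class ToyConnection:
    """Hypothetical stand-in for a connection; NOT the real SnowflakeConnection."""

    telemetry_enabled: bool = False  # client-side default: off
    buffer: list = field(default_factory=list)
    sent: list = field(default_factory=list)

    def update_parameters(self, server_params):
        # Models the reported behaviour: server-supplied values are applied as-is.
        for param in server_params:
            if param["name"] == "CLIENT_TELEMETRY_ENABLED":
                self.telemetry_enabled = param["value"]

    def log_imported_packages(self):
        # A log event is buffered as soon as telemetry is switched on.
        if self.telemetry_enabled:
            self.buffer.append({"type": "client_imported_packages"})

    def close(self):
        # Assumption: the flush on close does not re-check the flag.
        self.sent.extend(self.buffer)
        self.buffer.clear()


# Telemetry keys as they appear in the server reply quoted in this issue.
server_reply = [
    {"name": "CLIENT_TELEMETRY_ENABLED", "value": True},
    {"name": "CLIENT_TELEMETRY_SESSIONLESS_ENABLED", "value": True},
]

conn = ToyConnection()
conn.update_parameters(server_reply)  # happens during authentication
conn.log_imported_packages()          # buffered before user code runs
conn.telemetry_enabled = False        # the user's opt-out comes too late
conn.close()
print(len(conn.sent))  # prints 1
```

In this model the opt-out has no effect because the event is already buffered by the time user code gets a chance to run.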

What did you expect to see?

We did not expect to see CLIENT_TELEMETRY_ENABLED being set based on a server reply.

If this is a user setting, we would like to see where to configure it for our account.

But, as a library, it makes little sense for the Snowflake connector to just take all session_parameters as-is from the server reply.

The telemetry parameters should be explicitly excluded.

Fundamentally, there should be an implicit or easy way for no telemetry to be ever sent.
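
A minimal sketch of the kind of exclusion we have in mind is shown below. `drop_telemetry_parameters` is a hypothetical helper, not part of the connector, and the `TIMEZONE` entry is only an illustrative non-telemetry parameter; the two telemetry key names are the ones seen in the server reply.

```python
# Telemetry keys observed in data.parameters of the server reply.
TELEMETRY_KEYS = frozenset(
    {"CLIENT_TELEMETRY_ENABLED", "CLIENT_TELEMETRY_SESSIONLESS_ENABLED"}
)


def drop_telemetry_parameters(parameters):
    """Return the server parameters without the telemetry toggles.

    Hypothetical helper: it assumes the same list-of-dicts shape
    ({'name': ..., 'value': ...}) as the reply quoted in this issue.
    """
    return [p for p in parameters if p.get("name") not in TELEMETRY_KEYS]


server_parameters = [
    {"name": "CLIENT_TELEMETRY_ENABLED", "value": True},
    {"name": "CLIENT_TELEMETRY_SESSIONLESS_ENABLED", "value": True},
    {"name": "TIMEZONE", "value": "UTC"},  # illustrative non-telemetry entry
]

filtered = drop_telemetry_parameters(server_parameters)
print([p["name"] for p in filtered])  # prints ['TIMEZONE']
```

With a filter like this applied before the session parameters are merged, the client-side default would stay in force unless the user changes it.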

Can you set logging to DEBUG and collect the logs?

Relevant logs:

2024-03-15 22:59:19,823 - MainThread connection.py:734 - close() - INFO - closed
2024-03-15 22:59:19,823 - MainThread telemetry.py:211 - close() - DEBUG - Closing telemetry client.
2024-03-15 22:59:19,823 - MainThread telemetry.py:176 - send_batch() - DEBUG - Sending 1 logs to telemetry. Data is {'logs': [{'message': {'driver_type': 'PythonConnector', 'driver_version': '3.7.1', 'source': 'PythonConnector', 'type': 'client_imported_packages', 'value

dbold avatar Mar 15 '24 21:03 dbold

@dbold could you try this?

import snowflake.connector
from snowflake.connector.telemetry_oob import TelemetryService

# CONNECTION_PARAMETERS is your own dict of connection arguments
conn = snowflake.connector.connect(**CONNECTION_PARAMETERS)
# disable in-band telemetry
conn.telemetry_enabled = False
# disable out-of-band telemetry
TelemetryService.get_instance().disable()

sfc-gh-yixie avatar Apr 02 '24 18:04 sfc-gh-yixie

This does not seem to work:

with snowflake.connector.connect(...) as c:
    c.telemetry_enabled = False

    from snowflake.connector.telemetry_oob import TelemetryService
    # disable out-of-band telemetry
    TelemetryService.get_instance().disable()

as the logs show telemetry is sent:

2024-04-03 08:05:51,308 - MainThread connection.py:734 - close() - INFO - closed
2024-04-03 08:05:51,308 - MainThread telemetry.py:211 - close() - DEBUG - Closing telemetry client.
2024-04-03 08:05:51,309 - MainThread telemetry.py:176 - send_batch() - DEBUG - Sending 1 logs to telemetry. Data is {'logs': [{'message': {'driver_type': 'PythonConnector', 'driver_version': '3.7.1', 'source': 'PythonConnector', 'type': 'client_imported_packages', 'value': "{'ntpath', 'random', 'ctypes', 'itertools', 'quopri', 'asn1crypto', 'opcode', 'builtins', 'hashlib', 'logging', 'certifi', 'platform', 'inspect', 'enum'....

One workaround we explored is simply changing the telemetry URL:

# Try to break the telemetry client with a wrong url. It will auto-disable itself after sending the 1st packet and failing.
snowflake.connector.telemetry.TelemetryClient.SF_PATH_TELEMETRY = "/please-stop/sending"

But this seems to cause some other problems, so I'll probably open a separate issue.

dbold avatar Apr 03 '24 04:04 dbold

@dbold We're reviewing what's next for telemetry. Will update you later.

sfc-gh-yixie avatar Apr 03 '24 05:04 sfc-gh-yixie

Is there any update for this issue?

craigls avatar Aug 09 '24 14:08 craigls

I kept getting warnings in my log from telemetry: WARNING:snowflake.connector.telemetry:Failed to add log to telemetry. and I couldn't find a way to disable telemetry when instantiating the connector client.

It's a little hacky but I ended up setting telemetry_enabled to False overriding the default value in the SnowflakeConnection class' _update_parameters() function. I no longer see the warning/error.
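
One way to read this workaround, as a sketch only: wrap `_update_parameters()` so that `telemetry_enabled` is reset to False after every parameter update. `pin_telemetry_off` is a hypothetical helper, and `FakeConnection` (including its dict-based signature) is a stand-in so the example runs without the connector installed; it is an assumption, not the real class.

```python
import functools


def pin_telemetry_off(connection_cls):
    """Patch connection_cls._update_parameters to reset telemetry_enabled to False.

    Hypothetical helper: it only relies on the class having an
    _update_parameters method and a writable telemetry_enabled attribute.
    """
    original = connection_cls._update_parameters

    @functools.wraps(original)
    def patched(self, *args, **kwargs):
        result = original(self, *args, **kwargs)
        # Undo whatever the server-supplied parameters switched on.
        self.telemetry_enabled = False
        return result

    connection_cls._update_parameters = patched
    return connection_cls


class FakeConnection:
    """Stand-in for illustration; NOT the real SnowflakeConnection."""

    telemetry_enabled = False

    def _update_parameters(self, parameters):
        # Assumed shape: a mapping of parameter name to value.
        if "CLIENT_TELEMETRY_ENABLED" in parameters:
            self.telemetry_enabled = parameters["CLIENT_TELEMETRY_ENABLED"]


pin_telemetry_off(FakeConnection)
conn = FakeConnection()
conn._update_parameters({"CLIENT_TELEMETRY_ENABLED": True})
print(conn.telemetry_enabled)  # prints False
```

Applied to the real connector, this would presumably mean calling the helper on `snowflake.connector.connection.SnowflakeConnection` before connecting. That is untested here, and since it patches a private method it may break between releases.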

dougdragon avatar Aug 16 '24 13:08 dougdragon

We have fixed the bug that prevented telemetry from being disabled in the latest release, 3.12.1, and added a section on how to disable telemetry to our README: https://github.com/snowflakedb/snowflake-connector-python?tab=readme-ov-file#disable-telemetry.

Please try out the latest version, thanks!

sfc-gh-aling avatar Aug 20 '24 21:08 sfc-gh-aling

For reference, the code changes seem to be https://github.com/snowflakedb/snowflake-connector-python/pull/2013

We will check the new version.

dbold avatar Aug 21 '24 06:08 dbold