
SNOW-2046081: Library is not threadsafe when multiple sessions/roots exist due to globally shared state

Open corrylc opened this issue 8 months ago • 5 comments

Python version

3.12.7

Operating system and processor architecture

macOS-15.4-arm64-arm-64bit

Installed packages

snowflake==1.3.0
snowflake-connector-python==3.14.0
snowflake-core==1.3.0
snowflake-legacy==1.0.0
snowflake-snowpark-python==1.30.0

What did you do?

Difficult to show exactly, but using the Snowpark Session and Root, I created multiple sessions and roots.

Each session connects to a different Snowflake account.

I then used the sessions in a multi-threaded context, where API requests were issued on both sessions at the same time, in parallel threads.

What did you expect to see?

I expected to see the requests complete at roughly the same time, as each thread executed the API request using its own Session/Root.

Instead, some subset of threads returned 401 unauthorized errors. When not threaded, no authorization issues existed.

After investigating further, I found that snowflake.core.*._generated.ApiClient and snowflake.core.*._generated.Configuration both use a pattern where they store a _default instance of the class, as a classvar. Because this "default" is stored on the class, it bleeds across sessions/roots as it is effectively globally shared state.
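To illustrate the pattern described above, here is a minimal sketch using hypothetical stand-in classes (`FakeConfiguration` and `FakeSession` are invented for this example and are not the generated snowflake.core code). It assumes the default instance is created lazily and cached on the class, which is how the sharing appears to arise:

```python
from typing import ClassVar, Optional


class FakeConfiguration:
    """Hypothetical stand-in; not the real generated Configuration class."""

    # Stored on the class, so every caller in the process sees the same object.
    _default: ClassVar[Optional["FakeConfiguration"]] = None

    def __init__(self, host: str = "") -> None:
        self.host = host

    @classmethod
    def get_default(cls) -> "FakeConfiguration":
        # Lazily create the default once, then hand out the same instance.
        if cls._default is None:
            cls._default = cls()
        return cls._default


class FakeSession:
    """Hypothetical session that configures itself through the shared default."""

    def __init__(self, account: str) -> None:
        self.config = FakeConfiguration.get_default()
        # Placeholder host; this overwrites whatever an earlier session set.
        self.config.host = f"{account}.example.invalid"


session_a = FakeSession("account_a")
session_b = FakeSession("account_b")

# Both sessions hold the very same configuration object.
shared = session_a.config is session_b.config
# Session A now sees session B's host.
host_seen_by_a = session_a.config.host
```

In this sketch the second session silently redirects the first, which matches the cross-session bleed described in this report.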

The issue isn't visible with only one session, and even with multiple sessions it does not surface until the sessions are used concurrently from separate threads.

When triggered, the ApiClient and Configuration instances are constantly stomping on each other across threads/sessions, resulting in extremely confusing behavior, and usually 401 Unauthorized errors.
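The stomping can be reproduced deterministically without Snowflake at all. The sketch below is an assumption-laden model, not the library's code: `FakeApiClient`, `_default_token`, and `issue_request` are hypothetical names, and a `threading.Barrier` forces both threads to write before either one reads.

```python
import threading
from typing import ClassVar, Dict, Optional


class FakeApiClient:
    """Hypothetical stand-in with class-level shared state."""

    # Shared by every thread and every "session" in the process.
    _default_token: ClassVar[Optional[str]] = None


def issue_request(
    account: str,
    barrier: threading.Barrier,
    seen: Dict[str, Optional[str]],
) -> None:
    # Each thread installs its own credentials on the shared class...
    FakeApiClient._default_token = f"token-for-{account}"
    # ...then waits here, so both writes finish before either read happens.
    barrier.wait()
    # The value read is whichever thread happened to write last.
    seen[account] = FakeApiClient._default_token


barrier = threading.Barrier(2)
seen: Dict[str, Optional[str]] = {}
threads = [
    threading.Thread(target=issue_request, args=(name, barrier, seen))
    for name in ("account_a", "account_b")
]
for t in threads:
    t.start()
for t in threads:
    t.join()

# Accounts whose thread ended up holding another account's token.
mismatched = [name for name, token in seen.items() if token != f"token-for-{name}"]
```

Exactly one of the two threads ends up with the other account's token. Against a real service, sending the wrong account's credentials would plausibly be rejected, which is consistent with the 401 responses seen here.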

A workaround, which has to be implemented for each API client, is the following (example only shows fix on session and user APIs):

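        # Import each generated module's classes under distinct names so the
        # second import does not shadow the first.
        from snowflake.core.session._generated import ApiClient as SessionApiClient, Configuration as SessionConfiguration
        from snowflake.core.user._generated import ApiClient as UserApiClient, Configuration as UserConfiguration

        # Give each API on this root its own client instead of the shared default.
        root._snowapi_session._api._api_client = SessionApiClient(root, SessionConfiguration())
        root._users._api._api_client = UserApiClient(root, UserConfiguration())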

This workaround basically removes the shared _default instances, and replaces them with a unique instance for each session.

The fix is to remove the use of _default as a class variable, and create a new ApiClient/Configuration for each session.
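As a rough sketch of what the proposed fix could look like, the hypothetical classes below (`FakeConfiguration`, `FakeApiClient`, and `FakeRoot` are invented for illustration and do not mirror the real constructors) build a fresh configuration and client per root instead of reading a class-level default:

```python
class FakeConfiguration:
    """Hypothetical stand-in holding per-connection settings."""

    def __init__(self, host: str) -> None:
        self.host = host


class FakeApiClient:
    """Hypothetical stand-in that owns its configuration outright."""

    def __init__(self, configuration: FakeConfiguration) -> None:
        self.configuration = configuration


class FakeRoot:
    """Hypothetical root that never touches class-level defaults."""

    def __init__(self, account: str) -> None:
        # A new configuration and client are created for every root,
        # so nothing is shared between accounts. The host is a placeholder.
        self.api_client = FakeApiClient(
            FakeConfiguration(f"{account}.example.invalid")
        )


root_a = FakeRoot("account_a")
root_b = FakeRoot("account_b")

# Each root keeps its own configuration object and its own host.
isolated = root_a.api_client.configuration is not root_b.api_client.configuration
hosts = (
    root_a.api_client.configuration.host,
    root_b.api_client.configuration.host,
)
```

Because no state lives on the class, threads working with different roots cannot overwrite each other's settings, and no locking is needed for this particular problem.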

Can you set logging to DEBUG and collect the logs?

Issue not shown in logs, and redaction is too complicated at this scale.

Note

Initially filed on the snowflake-connector-python project, where it was pointed out that snowflake.core lives in this repo.

While I used the Snowpark API to trigger this, the issue is primarily in snowflake.core.Root and the various generated APIs.

corrylc, Apr 17 '25 15:04