snowflake-connector-python
SNOW-591884: pyarrow dependencies
While we use Snowflake's library, it isn't the only library we use and require. Each library depends on a different version of pyarrow, which can lead to conflicts.
Please do NOT tie this library to a specific version of pyarrow, especially since not every version is a breaking change. Instead, it should be tested against a range of pyarrow versions, check which version is installed, and adapt accordingly.
This would be extremely helpful
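As a rough illustration of what "check which version is installed and adapt" could look like in pure Python, here is a minimal sketch. It is not how snowflake-connector-python works: the function names, the `supported` range, and the returned labels are all hypothetical, and this kind of runtime check does not help code that is compiled against one specific pyarrow build.

```python
from importlib import metadata
from typing import Optional


def major_version(version: str) -> int:
    """Extract the leading major number from a version string like '10.0.1'."""
    return int(version.split(".")[0])


def installed_major(package: str) -> Optional[int]:
    """Return the installed major version of `package`, or None if it is absent."""
    try:
        return major_version(metadata.version(package))
    except metadata.PackageNotFoundError:
        return None


def choose_arrow_path(major: Optional[int], supported=range(8, 12)) -> str:
    """Pick a (hypothetical) code path from the detected pyarrow major version."""
    if major is None:
        return "no-arrow"        # pyarrow missing: fall back to non-Arrow handling
    if major in supported:
        return "arrow"           # inside the assumed tested range
    return "untested-arrow"      # installed, but outside the assumed tested range


if __name__ == "__main__":
    print(choose_arrow_path(installed_major("pyarrow")))
```

The `supported` range here is only a placeholder; a real library would set it from the versions it actually tests against.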
Please do NOT make this library tied to a specific version of pyarrow
I'm sorry, but we cannot loosen our pinning of pyarrow. However, I think that we have a good reason for this.
Our C-extension is compiled against a specific pyarrow version, and Arrow's internal APIs do change between major releases.
Upvote, even though I realize that the suggestion is a bit of a pain. Or perhaps arrow could be an optional install? I suspect many will uninstall pyarrow after chasing the cause of their segfault, since they may just use pandas.
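For readers unfamiliar with the "optional install" idea, a common Python pattern is to import the dependency lazily and fail only when a feature that needs it is actually used. The sketch below is generic and assumes nothing about the connector's internals; `optional_import` and `require` are hypothetical helper names.

```python
import importlib


def optional_import(name: str):
    """Import module `name` if available; return None instead of raising."""
    try:
        return importlib.import_module(name)
    except ImportError:
        return None


def require(module, feature: str, package: str):
    """Return `module`, or raise a clear error if the optional package is missing."""
    if module is None:
        raise RuntimeError(
            f"{feature} requires the optional package '{package}'; "
            "install it to use this feature."
        )
    return module


# Resolved once at import time; the rest of the library works without it.
pyarrow = optional_import("pyarrow")


def fetch_as_arrow():
    """Hypothetical Arrow-only feature that checks for pyarrow at call time."""
    pa = require(pyarrow, "Arrow result fetching", "pyarrow")
    return pa.table({"answer": [42]})
```

With this layout, users who never call the Arrow-only function never need pyarrow installed.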
Upvote as well
I'm sorry, but we cannot loosen our pinning of pyarrow.
@sfc-gh-mkeller would you please elaborate? Perhaps I'm missing something, but AFAICT, this package makes very limited use of pyarrow, to implement a few methods (like SnowflakeCursor.fetch_arrow_batches) that are entirely optional to use, not core functionality. Moreover, in looking at the issues in this repo, it seems that pyarrow upgrades are consistently treated as relatively low priority, with https://github.com/snowflakedb/snowflake-connector-python/pull/1349 being the latest example.
The end result is that many users of this package are blocked from making meaningful upgrades, e.g. to pyarrow 10 or Python 3.11, because a few ancillary features in this package require an outdated version of a great library.
Is there a reason, besides limited developer time, that pyarrow upgrades can't be prioritized, or better yet, the package made optional? Is there data showing heavy use of the pyarrow-based features, and/or performance tests that show significant improvements from using it? (Absent any data, I'd guess the latter is not true, since the Snowflake API always returns row batches of JSON strings. If pyarrow was being used to implement the Flight SQL protocol, the story might be different.)
Please consider changing the policy here. It would be a big win for many users, with zero cost for users who want to continue using a Snowflake-compatible version of pyarrow.
I am currently trying to upgrade a project to Python 3.11 but am blocked by this issue. pyarrow is on version 10.x, but snowflake-connector specifies v8.x, which will not build under Python 3.11, so our web app cannot be upgraded.
Similarly here, I'm trying to train some recommender systems with torchrec that uses pyarrow=10.0.1 but I wasn't able to find a snowflake-connector-python version to match those requirements 😢
Same here, I have a newer pyarrow installed and snowflake-connector-python always fails on installation.
Would love to see this made optional or upgraded as well.
@akravetz I just upgraded to snowflake-connector 3.x, then tried updating to python3.11 again, and this time it worked. Problem solved!
We are looking into improving our arrow dependency story over the next couple of quarters and will have an update here by the end of May 2023.
You're behind by a major version release at this point, I think: 11 vs 10.
👍 pandas and pyarrow restrictions will make snowflake-connector-python unusable with the latest versions of dask.
Hi all,
We have released a new preview version of the connector, built on nanoarrow, with a reduced package size and without the pyarrow dependency restriction. You can read about it in this blog post: https://medium.com/snowflake/supercharging-the-snowflake-python-connector-with-nanoarrow-8388cb57eeba
Do let us know your feedback. Note that this is still in preview, so we don't recommend using it in production.
Thanks, Anurag
Hi all, we're thrilled to announce that snowflake-connector-python 3.5.0 has been released. It removes the pyarrow dependency restriction and reduces the package size: https://pypi.org/project/snowflake-connector-python/3.5.0/
Please give it a try!