schemachange
Dependency lock for Pandas (and arguably all packages) is too strict
TLDR: I believe Pandas should either be refactored out of the library entirely, or the upper bound on the version should be removed. There is also a very strong argument for removing the other upper bounds.
Upper bounds on dependencies are well-intended: they aim to protect users doing standalone installations from breaking releases. In practice they can have the opposite effect, making installation in more complex environments a burden.
The setup.cfg has the following dependencies:
install_requires =
    jinja2~=3.0
    pandas~=1.3
    pyyaml~=6.0
    snowflake-connector-python>=2.8,<4.0
All of these strike me as a little too strict, given that only core API behavior is being utilized for each of the packages.
Pandas is the most noteworthy. It is hardly used in the code in the first place. (In fact, the code could be refactored so that it is not a dependency at all.) What little is used is core API behavior. And, unlike pyyaml and jinja2, a new major version (Pandas 2.x) already exists, and it is excluded by the upper bound of <2 that `pandas~=1.3` implies. There needn't be any version locking whatsoever for this.
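For anyone unsure why `~=1.3` rules out Pandas 2.x, here is a rough sketch of how a compatible-release clause expands into explicit bounds. It is not how pip resolves versions internally; the helper names `compatible_release_bounds` and `allows` are made up for illustration, and it only handles plain numeric versions (no pre-releases, epochs, or local versions).

```python
from typing import Tuple


def compatible_release_bounds(spec: str) -> Tuple[str, str]:
    """Expand a clause like '~=1.3' into (inclusive lower, exclusive upper)."""
    if not spec.startswith("~="):
        raise ValueError(f"not a compatible-release clause: {spec!r}")
    parts = [int(p) for p in spec[2:].strip().split(".")]
    if len(parts) < 2:
        raise ValueError("'~=' needs at least two release segments")
    # Drop the last segment and bump the one before it: ~=1.3 -> <2, ~=1.3.2 -> <1.4
    prefix = parts[:-1]
    prefix[-1] += 1
    lower = ".".join(str(p) for p in parts)
    upper = ".".join(str(p) for p in prefix)
    return lower, upper


def _key(version: str, width: int = 4) -> Tuple[int, ...]:
    # Pad with zeros so that "2" and "2.0.0" compare as equal.
    parts = [int(p) for p in version.split(".")]
    return tuple(parts + [0] * (width - len(parts)))


def allows(spec: str, version: str) -> bool:
    """Simplified check: does `version` fall inside the range implied by `spec`?"""
    lower, upper = compatible_release_bounds(spec)
    return _key(lower) <= _key(version) < _key(upper)


if __name__ == "__main__":
    for name, spec in [("jinja2", "~=3.0"), ("pandas", "~=1.3"), ("pyyaml", "~=6.0")]:
        lower, upper = compatible_release_bounds(spec)
        print(f"{name}{spec} means >={lower}, <{upper}")
    print(allows("~=1.3", "1.5.3"))  # True
    print(allows("~=1.3", "2.0.0"))  # False
```

So any environment that already has Pandas 2.x installed cannot satisfy the current pin, even though nothing in the usage here appears to depend on 1.x-specific behavior.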
While we are at it:
- PyYAML is also hardly being used; just the core API, which is not subject to any serious change.
- Jinja2 is a core part of the library, but it has a very stable API. Major version bumps have historically had very minor API implications and are consistently migrated very smoothly.
Overall these upper bounds strike me as overly cautious, especially in the case of Pandas.