kedro-plugins
`make plugin=kedro-datasets install-test-requirements` fails
Description
Running `make plugin=kedro-datasets install-test-requirements` does not work because of dependency conflicts.
Edit by @astrojuanlu: Summary and possible next steps at https://github.com/kedro-org/kedro-plugins/issues/597#issuecomment-2302102916
Context
I wanted to contribute a PR for some Polars support but I can't install the dependencies.
Steps to Reproduce
- Fork + clone the repo
- Create a conda environment with Python 3.9 (the contribution README says 3.6+, but PyPI says 3.9+):
  `conda create -n PR-kedro python=3.9`
  `conda activate PR-kedro`
- Run:
  `make plugin=kedro-datasets install-test-requirements`
Expected Result
Pip should install the required libraries.
Actual Result
Pip did not; resolution fails with the error below.
```
INFO: pip is looking at multiple versions of dask[complete] to determine which version is compatible with other requirements. This could take a while.
INFO: pip is still looking at multiple versions of dask[complete] to determine which version is compatible with other requirements. This could take a while.
INFO: This is taking longer than usual. You might need to provide the dependency resolver with stricter constraints to reduce runtime. See https://pip.pypa.io/warnings/backtracking for guidance. If you want to abort this run, press Ctrl + C.
ERROR: Cannot install dask[complete]==2024.2.1 and kedro-datasets[test]==2.1.0 because these package versions have conflicting dependencies.
The conflict is caused by:
    kedro-datasets[test] 2.1.0 depends on dask>=2021.10; extra == "test"
    dask[complete] 2024.2.1 depends on dask 2024.2.1 (from https://files.pythonhosted.org/packages/ff/d3/f1dcba697c7d7e8470ffa34b31ca1e663d4a2654ef806877f1017ecc5102/dask-2024.2.1-py3-none-any.whl (from https://pypi.org/simple/dask/) (requires-python:>=3.9))
To fix this you could try to:
1. loosen the range of package versions you've specified
2. remove package versions to allow pip attempt to solve the dependency conflict
ERROR: ResolutionImpossible: for help visit https://pip.pypa.io/en/latest/topics/dependency-resolution/#dealing-with-dependency-conflicts
```
Let me know if you want the full message from pip, but I think this covers all the relevant information.
Your Environment
Include as many relevant details about the environment in which you experienced the bug:
- Kedro version used (`pip show kedro` or `kedro -V`): current
- Kedro plugin and kedro plugin version used (`pip show kedro-airflow`): current
- Python version used (`python -V`): 3.9.18
- Operating system and version: Ubuntu 20.04
Thanks for the report @grofte, we'll look into this.
@grofte What pip version is this?
> contribution readme says 3.6+ but PyPI says 3.9+
Time to update the contribution readme too 👍🏽
`pip --version`:

```
pip 23.3.1 from /home/mog/anaconda3/envs/PR-kedro/lib/python3.9/site-packages/pip (python 3.9)
```

You're right though, the pip version does matter. I would suggest that you change `pip` in the Makefile to `python -m pip`, unless there's some problem with that that I am unaware of. It doesn't work with `python -m pip` either, though (and it's the same pip version).
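For concreteness, the suggested change would look roughly like this (a hypothetical sketch only; the actual target name and recipe in the repo's Makefile may be shaped differently):

```make
# Hypothetical sketch of the suggested tweak: call pip through the
# active interpreter instead of relying on a bare `pip` on PATH.
install-test-requirements:
	cd $(plugin) && python -m pip install ".[test]"
```

`python -m pip` guarantees the pip that runs belongs to the interpreter of the active environment, which avoids a stale `pip` shim on PATH (though, as noted above, it did not change the resolver outcome here).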
Good shout about the readme; on the other hand, we need more information.
I am checking some CI that we run which uses the make install command, and it runs successfully for Python 3.9: https://github.com/kedro-org/kedro/actions/runs/8158985607/job/22302183993
Most likely a pip version problem.
I don't even understand how Dask depending on Dask and kedro-datasets depending on Dask gives a conflict.
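To illustrate why the message is confusing: the two declared constraints are compatible on paper. A naive sketch that compares the versions as integer tuples (illustration only; real version parsing per PEP 440 is more involved):

```python
# Naive version comparison (illustration only): the dask pinned by
# dask[complete]==2024.2.1 satisfies kedro-datasets' dask>=2021.10 bound,
# so the two requirements are not genuinely in conflict.
def parse(version: str) -> tuple:
    """Split a dotted version string into a tuple of integers."""
    return tuple(int(part) for part in version.split("."))

lower_bound = parse("2021.10")   # kedro-datasets[test]: dask>=2021.10
pinned = parse("2024.2.1")       # dask[complete] 2024.2.1: dask==2024.2.1

print(pinned >= lower_bound)     # True
```

The failure is therefore a resolver limitation (pip backtracking across the many `[test]` extras), not a contradiction between these two requirements themselves.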
Anyway, I installed `dask[complete]` on its own first, commented it out from the `pyproject.toml`, and ran that. It took ages and I ran out of hard drive space, so I cleaned up my hard disk. Then I removed all the non-test optional dependencies in `pyproject.toml` and ran it with `uv pip install -r pyproject.toml --all-extras` (and commented out `pandas-gbq`, since Google somehow broke uv). That was really fast, but uv is apparently all-or-nothing when it comes to optional dependencies.
Running the rest of the CI worked, and `make test-no-spark` gave me `944 passed, 9 skipped, 21 xfailed, 2 xpassed, 53 errors in 117.09s (0:01:57)`, which I think is fair. The xfails are from TestVideoDataset, and most of the errors seem to be from AWS: `botocore.exceptions.ClientError: An error occurred (IllegalLocationConstraintException) when calling the CreateBucket operation: The unspecified location constraint is incompatible for the region specific endpoint this request was sent to.`
EDIT: omg you guys, uv was sooooo much faster
> EDIT: omg you guys, uv was sooooo much faster
Yep 😄 we're using it in our CI already.
About the Google dependency: we requested that they yank it, since it's invalid, but they didn't seem to fully understand and closed the issue already: https://github.com/googleapis/python-bigquery/issues/1818
> Anyway, I installed dask[complete] on its own first, commented it out from the pyproject.toml and ran that. It took ages and I ran out of hard drive space. So I cleaned up my hard disk.
Ughhhhh. I'm sorry, hope it was not too painful.
I did manage to do a draft PR and I probably fucked everything up =D https://github.com/kedro-org/kedro-plugins/pull/598
I confirm this is still the case as of today.
```
ERROR: Cannot install dask[complete]==2024.8.0 and kedro-datasets[test]==4.1.0 because these package versions have conflicting dependencies.
The conflict is caused by:
    kedro-datasets[test] 4.1.0 depends on dask>=2021.10; extra == "test"
    dask[complete] 2024.8.0 depends on dask 2024.8.0 (from https://files.pythonhosted.org/packages/db/47/136a5dd68a33089f96f8aa1178ccd545d325ec9ab2bb42a3038711a935c0/dask-2024.8.0-py3-none-any.whl (from https://pypi.org/simple/dask/) (requires-python:>=3.9))
```
...and yet, `uv pip install .[test]` was successful.
# echo ".[test]" > requirements.in
# uv pip compile requirements.in -o requirements.txt
...
# This file was autogenerated by uv via the following command:
# uv pip compile requirements.in -o requirements.txt
absl-py==2.1.0
# via
# keras
# tensorboard
# tensorflow
accelerate==0.31.0
# via
# kedro-datasets
# transformers
adlfs==2023.8.0
# via kedro-datasets
aiobotocore==2.4.2
# via s3fs
aiohappyeyeballs==2.4.0
# via aiohttp
aiohttp==3.10.5
# via
# adlfs
# aiobotocore
# datasets
# fsspec
# gcsfs
# s3fs
aioitertools==0.11.0
# via aiobotocore
aiosignal==1.3.1
# via aiohttp
antlr4-python3-runtime==4.9.3
# via omegaconf
anyio==4.4.0
# via
# httpx
# jupyter-server
appdirs==1.4.4
# via
# fs
# kedro-telemetry
# pins
argon2-cffi==23.1.0
# via jupyter-server
argon2-cffi-bindings==21.2.0
# via argon2-cffi
arrow==1.3.0
# via
# cookiecutter
# isoduration
asn1crypto==1.5.1
# via snowflake-connector-python
astunparse==1.6.3
# via tensorflow
async-lru==2.0.4
# via jupyterlab
async-timeout==4.0.3
# via
# aiohttp
# redis
atpublic==4.1.0
# via ibis-framework
attrs==24.2.0
# via
# aiohttp
# fiona
# jsonschema
# kedro
# referencing
azure-core==1.30.2
# via
# adlfs
# azure-identity
# azure-storage-blob
azure-datalake-store==0.0.53
# via adlfs
azure-identity==1.17.1
# via adlfs
azure-storage-blob==12.22.0
# via adlfs
babel==2.16.0
# via jupyterlab-server
backcall==0.2.0
# via ipython
bandit==1.7.9
# via kedro-datasets
beautifulsoup4==4.12.3
# via nbconvert
behave==1.2.6
# via kedro-datasets
bidict==0.23.1
# via ibis-framework
binaryornot==0.4.4
# via cookiecutter
biopython==1.84
# via kedro-datasets
black==22.12.0
# via
# blacken-docs
# kedro-datasets
blacken-docs==1.9.2
# via kedro-datasets
bleach==6.1.0
# via
# nbconvert
# panel
blosc2==2.5.1
# via tables
bokeh==3.4.3
# via
# dask
# holoviews
# panel
boto3==1.24.59
# via moto
botocore==1.27.59
# via
# aiobotocore
# boto3
# moto
# s3transfer
build==1.2.1
# via kedro
cachetools==5.5.0
# via
# google-auth
# kedro
certifi==2024.7.4
# via
# fiona
# httpcore
# httpx
# pyproj
# requests
# snowflake-connector-python
cffi==1.17.0
# via
# argon2-cffi-bindings
# azure-datalake-store
# cryptography
# snowflake-connector-python
cfgv==3.4.0
# via pre-commit
chardet==5.2.0
# via binaryornot
charset-normalizer==3.3.2
# via
# requests
# snowflake-connector-python
click==8.1.7
# via
# black
# click-plugins
# cligj
# cookiecutter
# dask
# distributed
# fiona
# import-linter
# kedro
click-plugins==1.1.1
# via fiona
cligj==0.7.2
# via fiona
cloudpickle==2.0.0
# via
# dask
# distributed
# kedro-datasets
# snowflake-snowpark-python
colorcet==3.1.0
# via holoviews
comm==0.2.2
# via
# ipykernel
# ipywidgets
compress-pickle==2.1.0
# via kedro-datasets
contourpy==1.2.1
# via bokeh
cookiecutter==2.6.0
# via kedro
coverage==7.6.1
# via
# kedro-datasets
# pytest-cov
cryptography==43.0.0
# via
# azure-identity
# azure-storage-blob
# moto
# msal
# pyjwt
# pyopenssl
# snowflake-connector-python
# types-pyopenssl
# types-redis
cycler==0.12.1
# via matplotlib
dask==2024.8.0
# via
# dask-expr
# distributed
# kedro-datasets
dask-expr==1.1.10
# via dask
datasets==2.2.1
# via kedro-datasets
db-dtypes==1.3.0
# via pandas-gbq
debugpy==1.8.5
# via ipykernel
decorator==5.1.1
# via
# gcsfs
# ipython
defusedxml==0.7.1
# via nbconvert
delta-spark==2.4.0
# via kedro-datasets
deltalake==0.19.1
# via
# kedro-datasets
# polars
dill==0.3.8
# via
# datasets
# kedro-datasets
# multiprocess
distlib==0.3.8
# via virtualenv
distributed==2024.8.0
# via dask
docopt==0.6.2
# via hdfs
duckdb==0.10.3
# via ibis-framework
dynaconf==3.2.6
# via kedro
et-xmlfile==1.1.0
# via openpyxl
exceptiongroup==1.2.2
# via
# anyio
# pytest
execnet==2.1.1
# via pytest-xdist
fastjsonschema==2.20.0
# via nbformat
filelock==3.15.4
# via
# huggingface-hub
# kedro-datasets
# snowflake-connector-python
# torch
# transformers
# triton
# virtualenv
fiona==1.9.6
# via geopandas
flatbuffers==24.3.25
# via tensorflow
fqdn==1.5.1
# via jsonschema
frozenlist==1.4.1
# via
# aiohttp
# aiosignal
fs==2.4.16
# via triad
fsspec==2023.1.0
# via
# adlfs
# dask
# datasets
# gcsfs
# huggingface-hub
# ibis-framework
# kedro
# pins
# s3fs
# torch
# triad
gast==0.6.0
# via tensorflow
gcsfs==2023.1.0
# via
# kedro-datasets
# pins
geopandas==0.14.4
# via kedro-datasets
gitdb==4.0.11
# via gitdb2
gitdb2==4.0.2
# via gitpython
gitpython==3.0.6
# via
# kedro
# trufflehog
google-api-core==2.19.1
# via
# google-cloud-bigquery
# google-cloud-core
# google-cloud-storage
# pandas-gbq
google-auth==2.34.0
# via
# gcsfs
# google-api-core
# google-auth-oauthlib
# google-cloud-bigquery
# google-cloud-core
# google-cloud-storage
# pandas-gbq
# pydata-google-auth
google-auth-oauthlib==1.2.1
# via
# gcsfs
# pandas-gbq
# pydata-google-auth
google-cloud-bigquery==3.25.0
# via pandas-gbq
google-cloud-core==2.4.1
# via
# google-cloud-bigquery
# google-cloud-storage
google-cloud-storage==2.18.2
# via gcsfs
google-crc32c==1.5.0
# via
# google-cloud-storage
# google-resumable-media
google-pasta==0.2.0
# via tensorflow
google-resumable-media==2.7.2
# via
# google-cloud-bigquery
# google-cloud-storage
googleapis-common-protos==1.63.2
# via
# google-api-core
# grpcio-status
greenlet==3.0.3
# via sqlalchemy
grimp==1.3
# via import-linter
grpcio==1.65.5
# via
# google-api-core
# grpcio-status
# tensorboard
# tensorflow
grpcio-status==1.62.3
# via google-api-core
h11==0.14.0
# via httpcore
h5py==3.11.0
# via
# keras
# tensorflow
hdfs==2.7.3
# via kedro-datasets
holoviews==1.19.1
# via kedro-datasets
httpcore==1.0.5
# via httpx
httpx==0.27.0
# via jupyterlab
huggingface-hub==0.17.3
# via
# accelerate
# datasets
# kedro-datasets
# tokenizers
# transformers
humanize==4.10.0
# via pins
ibis-framework==9.0.0
# via kedro-datasets
identify==2.6.0
# via pre-commit
idna==3.7
# via
# anyio
# httpx
# jsonschema
# requests
# snowflake-connector-python
# yarl
import-linter==1.2.6
# via kedro-datasets
importlib-metadata==8.4.0
# via
# build
# dask
# delta-spark
# fiona
# jupyter-client
# jupyter-lsp
# jupyterlab
# jupyterlab-server
# kedro
# markdown
# nbconvert
# pins
importlib-resources==6.4.3
# via
# kedro
# pins
iniconfig==2.0.0
# via pytest
ipykernel==6.29.5
# via
# jupyter
# jupyter-console
# jupyterlab
# qtconsole
ipython==7.34.0
# via
# ipykernel
# ipywidgets
# jupyter-console
# kedro-datasets
ipywidgets==8.1.3
# via jupyter
isodate==0.6.1
# via azure-storage-blob
isoduration==20.11.0
# via jsonschema
jedi==0.19.1
# via ipython
jinja2==3.0.3
# via
# bokeh
# cookiecutter
# dask
# distributed
# jupyter-server
# jupyterlab
# jupyterlab-server
# kedro-datasets
# moto
# nbconvert
# pins
# torch
jmespath==1.0.1
# via
# boto3
# botocore
joblib==1.4.2
# via
# kedro-datasets
# pins
# scikit-learn
json5==0.9.25
# via jupyterlab-server
jsonpointer==3.0.0
# via jsonschema
jsonschema==4.23.0
# via
# jupyter-events
# jupyterlab-server
# nbformat
jsonschema-specifications==2023.12.1
# via jsonschema
jupyter==1.0.0
# via kedro-datasets
jupyter-client==8.6.2
# via
# ipykernel
# jupyter-console
# jupyter-server
# nbclient
# qtconsole
jupyter-console==6.6.3
# via jupyter
jupyter-core==5.7.2
# via
# ipykernel
# jupyter-client
# jupyter-console
# jupyter-server
# jupyterlab
# nbclient
# nbconvert
# nbformat
# qtconsole
jupyter-events==0.10.0
# via jupyter-server
jupyter-lsp==2.2.5
# via jupyterlab
jupyter-server==2.14.2
# via
# jupyter-lsp
# jupyterlab
# jupyterlab-server
# notebook
# notebook-shim
jupyter-server-terminals==0.5.3
# via jupyter-server
jupyterlab==4.2.4
# via
# kedro-datasets
# notebook
jupyterlab-pygments==0.3.0
# via nbconvert
jupyterlab-server==2.27.3
# via
# jupyterlab
# notebook
jupyterlab-widgets==3.0.11
# via ipywidgets
kedro==0.19.7
# via
# kedro-datasets
# kedro-telemetry
.
# via -r requirements.in
kedro-telemetry==0.6.0
# via kedro
keras==3.5.0
# via tensorflow
kiwisolver==1.4.5
# via matplotlib
lazy-loader==0.4
# via kedro-datasets
libclang==18.1.1
# via tensorflow
linkify-it-py==2.0.3
# via panel
locket==1.0.0
# via
# distributed
# partd
lxml==4.9.4
# via kedro-datasets
lz4==4.3.3
# via
# compress-pickle
# dask
markdown==3.7
# via
# panel
# tensorboard
markdown-it-py==3.0.0
# via
# mdit-py-plugins
# panel
# rich
markupsafe==2.1.5
# via
# jinja2
# nbconvert
# werkzeug
matplotlib==3.3.4
# via kedro-datasets
matplotlib-inline==0.1.7
# via
# ipykernel
# ipython
mdit-py-plugins==0.4.1
# via panel
mdurl==0.1.2
# via markdown-it-py
memory-profiler==0.61.0
# via kedro-datasets
mistune==3.0.2
# via nbconvert
ml-dtypes==0.4.0
# via
# keras
# tensorflow
more-itertools==10.4.0
# via kedro
moto==5.0.0
# via kedro-datasets
mpmath==1.3.0
# via sympy
msal==1.30.0
# via
# azure-datalake-store
# azure-identity
# msal-extensions
msal-extensions==1.2.0
# via azure-identity
msgpack==1.0.8
# via
# blosc2
# distributed
multidict==6.0.5
# via
# aiohttp
# yarl
multiprocess==0.70.16
# via datasets
mypy==1.11.1
# via kedro-datasets
mypy-extensions==1.0.0
# via
# black
# mypy
namex==0.0.8
# via keras
nbclient==0.10.0
# via nbconvert
nbconvert==7.16.4
# via
# jupyter
# jupyter-server
nbformat==5.10.4
# via
# jupyter-server
# nbclient
# nbconvert
ndindex==1.8
# via blosc2
nest-asyncio==1.6.0
# via ipykernel
networkx==2.8.8
# via
# grimp
# kedro-datasets
# torch
nodeenv==1.9.1
# via pre-commit
notebook==7.2.1
# via jupyter
notebook-shim==0.2.4
# via
# jupyterlab
# notebook
numexpr==2.10.1
# via tables
numpy==1.26.4
# via
# accelerate
# biopython
# blosc2
# bokeh
# contourpy
# dask
# datasets
# db-dtypes
# geopandas
# h5py
# holoviews
# ibis-framework
# keras
# matplotlib
# ml-dtypes
# numexpr
# opencv-python
# opt-einsum
# pandas
# pandas-gbq
# pyarrow
# scikit-learn
# scipy
# shapely
# tables
# tensorboard
# tensorflow
# transformers
# triad
# xarray
nvidia-cublas-cu12==12.1.3.1
# via
# nvidia-cudnn-cu12
# nvidia-cusolver-cu12
# torch
nvidia-cuda-cupti-cu12==12.1.105
# via torch
nvidia-cuda-nvrtc-cu12==12.1.105
# via torch
nvidia-cuda-runtime-cu12==12.1.105
# via torch
nvidia-cudnn-cu12==9.1.0.70
# via torch
nvidia-cufft-cu12==11.0.2.54
# via torch
nvidia-curand-cu12==10.3.2.106
# via torch
nvidia-cusolver-cu12==11.4.5.107
# via torch
nvidia-cusparse-cu12==12.1.0.106
# via
# nvidia-cusolver-cu12
# torch
nvidia-nccl-cu12==2.20.5
# via torch
nvidia-nvjitlink-cu12==12.6.20
# via
# nvidia-cusolver-cu12
# nvidia-cusparse-cu12
nvidia-nvtx-cu12==12.1.105
# via torch
oauthlib==3.2.2
# via requests-oauthlib
omegaconf==2.3.0
# via kedro
opencv-python==4.5.5.64
# via kedro-datasets
openpyxl==3.1.5
# via kedro-datasets
opt-einsum==3.3.0
# via tensorflow
optree==0.12.1
# via keras
overrides==7.7.0
# via jupyter-server
packaging==24.1
# via
# accelerate
# bokeh
# build
# dask
# datasets
# db-dtypes
# distributed
# geopandas
# google-cloud-bigquery
# holoviews
# huggingface-hub
# ipykernel
# jupyter-server
# jupyterlab
# jupyterlab-server
# kedro-datasets
# keras
# lazy-loader
# nbconvert
# pandas-gbq
# plotly
# pytest
# pytoolconfig
# qtconsole
# qtpy
# snowflake-connector-python
# tables
# tensorboard
# tensorflow
# transformers
# xarray
pandas==2.2.2
# via
# bokeh
# dask
# dask-expr
# datasets
# db-dtypes
# geopandas
# holoviews
# ibis-framework
# kedro-datasets
# pandas-gbq
# panel
# pins
# triad
# xarray
pandas-gbq==0.23.1
# via kedro-datasets
pandocfilters==1.5.1
# via nbconvert
panel==1.4.5
# via holoviews
param==2.1.1
# via
# holoviews
# panel
# pyviz-comms
parse==1.20.2
# via
# behave
# kedro
# parse-type
parse-type==0.6.2
# via behave
parso==0.8.4
# via jedi
parsy==2.1
# via ibis-framework
partd==1.4.2
# via dask
pathspec==0.12.1
# via black
pbr==6.0.0
# via stevedore
pexpect==4.9.0
# via ipython
pickleshare==0.7.5
# via ipython
pillow==9.5.0
# via
# bokeh
# kedro-datasets
# matplotlib
pins==0.8.6
# via ibis-framework
platformdirs==4.2.2
# via
# black
# jupyter-core
# pytoolconfig
# snowflake-connector-python
# virtualenv
plotly==5.23.0
# via kedro-datasets
pluggy==1.5.0
# via
# kedro
# pytest
polars==0.18.15
# via kedro-datasets
portalocker==2.10.1
# via msal-extensions
pre-commit==3.8.0
# via kedro-datasets
pre-commit-hooks==4.6.0
# via kedro
prometheus-client==0.20.0
# via jupyter-server
prompt-toolkit==3.0.47
# via
# ipython
# jupyter-console
proto-plus==1.24.0
# via google-api-core
protobuf==4.25.4
# via
# google-api-core
# googleapis-common-protos
# grpcio-status
# proto-plus
# tensorboard
# tensorflow
psutil==6.0.0
# via
# accelerate
# distributed
# ipykernel
# memory-profiler
# pytest-xdist
ptyprocess==0.7.0
# via
# pexpect
# terminado
py==1.11.0
# via pytest-forked
py-cpuinfo==9.0.0
# via
# blosc2
# tables
py4j==0.10.9.7
# via pyspark
pyarrow==16.1.0
# via
# dask
# dask-expr
# datasets
# db-dtypes
# deltalake
# ibis-framework
# kedro-datasets
# pandas-gbq
# triad
pyarrow-hotfix==0.6
# via
# dask
# ibis-framework
pyasn1==0.6.0
# via
# pyasn1-modules
# rsa
pyasn1-modules==0.4.0
# via google-auth
pycparser==2.22
# via cffi
pydata-google-auth==1.8.2
# via pandas-gbq
pygments==2.18.0
# via
# ipython
# jupyter-console
# nbconvert
# qtconsole
# rich
pyjwt==2.9.0
# via
# msal
# snowflake-connector-python
pyodbc==5.1.0
# via kedro-datasets
pyopenssl==24.2.1
# via snowflake-connector-python
pyparsing==3.1.2
# via matplotlib
pyproj==3.6.1
# via
# geopandas
# kedro-datasets
pyproject-hooks==1.1.0
# via build
pyspark==3.4.3
# via
# delta-spark
# kedro-datasets
pytest==7.4.4
# via
# kedro-datasets
# pytest-cov
# pytest-forked
# pytest-mock
# pytest-xdist
pytest-cov==3.0.0
# via kedro-datasets
pytest-forked==1.6.0
# via pytest-xdist
pytest-mock==1.13.0
# via kedro-datasets
pytest-xdist==2.2.1
# via kedro-datasets
python-dateutil==2.9.0.post0
# via
# arrow
# botocore
# google-cloud-bigquery
# ibis-framework
# jupyter-client
# matplotlib
# moto
# pandas
python-json-logger==2.0.7
# via jupyter-events
python-slugify==8.0.4
# via cookiecutter
pytoolconfig==1.3.1
# via rope
pytz==2024.1
# via
# ibis-framework
# pandas
# snowflake-connector-python
pyviz-comms==3.0.3
# via
# holoviews
# panel
pyyaml==6.0.2
# via
# accelerate
# bandit
# bokeh
# cookiecutter
# dask
# distributed
# huggingface-hub
# jupyter-events
# kedro
# omegaconf
# pins
# pre-commit
# snowflake-snowpark-python
# transformers
pyzmq==26.1.1
# via
# ipykernel
# jupyter-client
# jupyter-console
# jupyter-server
# qtconsole
qtconsole==5.5.2
# via jupyter
qtpy==2.4.1
# via qtconsole
redis==4.6.0
# via kedro-datasets
referencing==0.35.1
# via
# jsonschema
# jsonschema-specifications
# jupyter-events
regex==2024.7.24
# via transformers
requests==2.32.3
# via
# azure-core
# azure-datalake-store
# cookiecutter
# datasets
# fsspec
# gcsfs
# google-api-core
# google-cloud-bigquery
# google-cloud-storage
# hdfs
# huggingface-hub
# jupyterlab-server
# kedro-datasets
# kedro-telemetry
# moto
# msal
# panel
# pins
# requests-mock
# requests-oauthlib
# responses
# snowflake-connector-python
# tensorflow
# transformers
requests-mock==1.12.1
# via kedro-datasets
requests-oauthlib==2.0.0
# via google-auth-oauthlib
responses==0.18.0
# via
# datasets
# moto
rfc3339-validator==0.1.4
# via
# jsonschema
# jupyter-events
rfc3986-validator==0.1.1
# via
# jsonschema
# jupyter-events
rich==13.7.1
# via
# bandit
# cookiecutter
# ibis-framework
# kedro
# keras
rope==1.13.0
# via kedro
rpds-py==0.20.0
# via
# jsonschema
# referencing
rsa==4.9
# via google-auth
ruamel-yaml==0.18.6
# via pre-commit-hooks
ruamel-yaml-clib==0.2.8
# via ruamel-yaml
ruff==0.0.292
# via kedro-datasets
s3fs==2023.1.0
# via kedro-datasets
s3transfer==0.6.2
# via boto3
safetensors==0.4.4
# via
# accelerate
# transformers
scikit-learn==1.5.1
# via kedro-datasets
scipy==1.13.1
# via
# kedro-datasets
# scikit-learn
send2trash==1.8.3
# via jupyter-server
setuptools==73.0.1
# via
# fs
# ipython
# jupyterlab
# pandas-gbq
# pydata-google-auth
# snowflake-snowpark-python
# tensorboard
# tensorflow
shapely==2.0.6
# via geopandas
six==1.16.0
# via
# astunparse
# azure-core
# behave
# bleach
# fiona
# fs
# google-pasta
# hdfs
# isodate
# parse-type
# python-dateutil
# rfc3339-validator
# tensorboard
# tensorflow
# triad
smmap==5.0.1
# via gitdb
sniffio==1.3.1
# via
# anyio
# httpx
snowflake-connector-python==3.12.1
# via snowflake-snowpark-python
snowflake-snowpark-python==1.21.0
# via kedro-datasets
sortedcontainers==2.4.0
# via
# distributed
# snowflake-connector-python
soupsieve==2.6
# via beautifulsoup4
sqlalchemy==2.0.32
# via kedro-datasets
sqlglot==23.12.2
# via ibis-framework
stevedore==5.2.0
# via bandit
sympy==1.13.2
# via torch
tables==3.9.2
# via kedro-datasets
tblib==3.0.0
# via distributed
tenacity==9.0.0
# via plotly
tensorboard==2.17.1
# via tensorflow
tensorboard-data-server==0.7.2
# via tensorboard
tensorflow==2.17.0
# via kedro-datasets
tensorflow-io-gcs-filesystem==0.37.1
# via tensorflow
termcolor==2.4.0
# via tensorflow
terminado==0.18.1
# via
# jupyter-server
# jupyter-server-terminals
text-unidecode==1.3
# via python-slugify
threadpoolctl==3.5.0
# via scikit-learn
tinycss2==1.3.0
# via nbconvert
tokenizers==0.15.2
# via transformers
toml==0.10.2
# via
# import-linter
# kedro
tomli==2.0.1
# via
# black
# build
# coverage
# jupyterlab
# mypy
# pre-commit-hooks
# pytest
# pytoolconfig
tomlkit==0.13.2
# via snowflake-connector-python
toolz==0.12.1
# via
# dask
# distributed
# ibis-framework
# partd
torch==2.4.0
# via
# accelerate
# transformers
tornado==6.4.1
# via
# bokeh
# distributed
# ipykernel
# jupyter-client
# jupyter-server
# jupyterlab
# notebook
# terminado
tqdm==4.66.5
# via
# datasets
# huggingface-hub
# panel
# transformers
traitlets==5.14.3
# via
# comm
# ipykernel
# ipython
# ipywidgets
# jupyter-client
# jupyter-console
# jupyter-core
# jupyter-events
# jupyter-server
# jupyterlab
# matplotlib-inline
# nbclient
# nbconvert
# nbformat
# qtconsole
transformers==4.35.2
# via kedro-datasets
triad==0.9.8
# via kedro-datasets
triton==3.0.0
# via torch
trufflehog==2.2.1
# via kedro-datasets
trufflehogregexes==0.0.7
# via trufflehog
types-cachetools==5.5.0.20240820
# via kedro-datasets
types-cffi==1.16.0.20240331
# via types-pyopenssl
types-decorator==5.1.8.20240310
# via kedro-datasets
types-pyopenssl==24.1.0.20240722
# via types-redis
types-python-dateutil==2.9.0.20240821
# via arrow
types-pyyaml==6.0.12.20240808
# via kedro-datasets
types-redis==4.6.0.20240819
# via kedro-datasets
types-requests==2.31.0.6
# via kedro-datasets
types-setuptools==72.2.0.20240821
# via types-cffi
types-six==1.16.21.20240513
# via kedro-datasets
types-tabulate==0.9.0.20240106
# via kedro-datasets
types-urllib3==1.26.25.14
# via types-requests
typing-extensions==4.12.2
# via
# aioitertools
# anyio
# async-lru
# azure-core
# azure-identity
# azure-storage-blob
# black
# huggingface-hub
# ibis-framework
# kedro
# mypy
# optree
# panel
# snowflake-connector-python
# snowflake-snowpark-python
# sqlalchemy
# tensorflow
# torch
tzdata==2024.1
# via pandas
uc-micro-py==1.0.3
# via linkify-it-py
uri-template==1.3.0
# via jsonschema
urllib3==1.26.19
# via
# botocore
# distributed
# requests
# responses
# snowflake-connector-python
virtualenv==20.26.3
# via pre-commit
wcwidth==0.2.13
# via prompt-toolkit
webcolors==24.8.0
# via jsonschema
webencodings==0.5.1
# via
# bleach
# tinycss2
websocket-client==1.8.0
# via jupyter-server
werkzeug==3.0.3
# via
# moto
# tensorboard
wheel==0.44.0
# via
# astunparse
# snowflake-snowpark-python
widgetsnbextension==4.0.11
# via ipywidgets
wrapt==1.16.0
# via
# aiobotocore
# tensorflow
xarray==2024.7.0
# via kedro-datasets
xlsx2csv==0.8.3
# via polars
xlsxwriter==1.4.5
# via kedro-datasets
xmltodict==0.13.0
# via moto
xxhash==3.5.0
# via
# datasets
# pins
xyzservices==2024.6.0
# via
# bokeh
# panel
yarl==1.9.4
# via aiohttp
zict==3.0.0
# via distributed
zipp==3.20.0
# via
# importlib-metadata
# importlib-resources
In fact, the `uv` resolution can be fed to `pip`:
```
# pip install -r requirements.txt
...
Successfully built antlr4-python3-runtime docopt grimp hdfs import-linter pyspark kedro-datasets
Installing collected packages: ...
```
So a dependency solution exists; it's just that `pip` cannot resolve it. Admittedly, the test dependencies for `kedro-datasets` are contrived, but there's no conflict: the installation failure stems from a limitation in the pip resolver. And yet, our Makefile uses `pip`, so contributors will keep hitting this wall.
Are we ready to change our Makefile to use `uv`?
Btw another contributor was blocked by this https://github.com/kedro-org/kedro-plugins/pull/807#issuecomment-2298757343
Data point... trying to work on a PR for #808 and ran into this issue... I'll try `uv`.
Using Python 3.10, pip 24.2 (latest).
Well, every single kedro-datasets contributor is being blocked by this, so I'm liberally assigning High priority.
As @ankatiyar pointed out in a meeting today, we're now using `uv` in our own CI, so `make install-test-requirements` is basically untested ⚠️
https://github.com/kedro-org/kedro-plugins/blob/b766f45f8660a8f49d7f4eba30f5e556b89fc8c6/.github/workflows/unit-tests.yml#L47-L51
We can do several things:
- We change our Makefile so that it uses `uv`, and put it back on our CI.
- If we're not ready to endorse `uv` just yet, at least we should adjust our documentation guides, and potentially remove the command too.

Thoughts?
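The first option could look something like this (a hypothetical sketch, assuming `uv` gets installed into the contributor's environment; the real Makefile recipe may differ):

```make
# Hypothetical sketch of the first option: let uv do the resolving.
install-test-requirements:
	python -m pip install uv
	cd $(plugin) && uv pip install ".[test]"
```

This mirrors what the thread already demonstrated manually: uv resolves the `[test]` extras where pip's backtracking gives up.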
I have actually fixed `uv` with https://github.com/kedro-org/kedro-plugins/pull/464 already; the remaining bit is the system dependencies that are needed for OpenCV. Honestly, I don't think users should be bothered about this unless they are working with the VideoDataset. We can update our docs to make this clearer.
I see we merged this a few days after we last spotted the problem (https://github.com/kedro-org/kedro-plugins/issues/597#issuecomment-2303535015), so let's close the issue.
Sounds good, I have added a note in the installation guide (GitHub wiki).