clearml
Incorrect docker environment setup
Describe the bug
ClearML does not detect all the necessary packages, so I manually add my requirements.txt to the experiment:
clm.Task.add_requirements("requirements.txt")
Now the packages listed for the task are correct. But when I clone the task and enqueue it,
something strange happens: the Docker environment is missing packages that are listed in the original pip:
original pip:
# Python 3.11.8 (main, Feb 26 2024, 21:39:34) [GCC 11.2.0]
PyYAML ==6.0.1
clearml == 1.15.1
docstring_parser ==0.16
importlib_resources ==6.4.0
jsonargparse ==4.27.7
lightning ==2.2.3
matplotlib
numpy ==1.26.4
pandas
pillow == 10.2.0
torch ==2.3.0+cu118
torchmetrics ==1.3.2
torchvision ==0.18.0+cu118
tqdm
typeshed_client ==2.5.1
typing_extensions ==4.11.0
zipp ==3.18.1
pip:
attrs==23.2.0
certifi==2024.2.2
charset-normalizer==3.3.2
clearml==1.15.1
Cython==3.0.10
distlib==0.3.8
filelock==3.13.4
furl==2.1.3
idna==3.7
jsonargparse==4.27.7
jsonschema==4.21.1
jsonschema-specifications==2023.12.1
orderedmultidict==1.0.1
pandas==2.2.2
pathlib2==2.3.7.post1
pillow==10.2.0
platformdirs==4.2.1
psutil==5.9.8
PyJWT==2.8.0
pyparsing==3.1.2
python-dateutil==2.8.2
pytz==2024.1
PyYAML==6.0.1
referencing==0.35.0
requests==2.31.0
rpds-py==0.18.0
six==1.16.0
torchvision==0.18.0+cu118
tzdata==2024.1
urllib3==1.26.18
virtualenv==20.26.0
zipp==3.18.1
The task fails because torch and lightning are not installed. Interestingly, torchvision is installed, so I have no clue what could be going wrong.
I have attached the complete log:
task_cab63ae315ba4a7690ea77248d1e9e48.log
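To see at a glance which packages were dropped, the two listings can be diffed by package name. The following is only a minimal sketch in plain Python, not part of ClearML: the helper `package_names` is a hypothetical name introduced here, and the two strings are short excerpts of the lists above rather than the full output.

```python
import re


def package_names(listing: str) -> set[str]:
    """Return the normalized package names found in a requirements-style listing."""
    names = set()
    for line in listing.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments such as the Python version header
        # The name is everything before the first space or version operator.
        name = re.split(r"[\s=<>!~;\[]", line, maxsplit=1)[0]
        # Treat '-', '_' and '.' as equivalent and ignore case.
        names.add(re.sub(r"[-_.]+", "-", name).lower())
    return names


# Excerpt of the "original pip" section (local run).
original = """
# Python 3.11.8 (main, Feb 26 2024, 21:39:34) [GCC 11.2.0]
PyYAML ==6.0.1
clearml == 1.15.1
lightning ==2.2.3
pandas
torch ==2.3.0+cu118
torchvision ==0.18.0+cu118
"""

# Excerpt of the "pip" section (remote run inside Docker).
installed = """
clearml==1.15.1
pandas==2.2.2
PyYAML==6.0.1
torchvision==0.18.0+cu118
"""

missing = sorted(package_names(original) - package_names(installed))
print(missing)
```

Pasting the complete lists into `original` and `installed` would show every requested package that is absent from the remote environment, not just torch and lightning.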
Environment
Self-hosted WebApp: 1.15.0-472 • Server: 1.15.0-472 • API: 2.29
Python version: 3.11
OS: Linux, Ubuntu 22
Hi @terbed, was the original pip you attached created during the local run of the task, before running it remotely?
Yes, that is true. More precisely, the remote run also stores the original pip packages, and that is the list copied here. But the original pip was generated by a local run of the task.