Incorrect docker environment setup

Open terbed opened this issue 2 months ago • 2 comments

Describe the bug

ClearML does not detect all the necessary packages, so I manually add my requirements.txt to the experiment:

clm.Task.add_requirements("requirements.txt")

Now the environment packages listed for the task are correct. But if I clone the task and enqueue it, something strange happens: the Docker environment is missing packages that are listed in the original pip.

original pip:

# Python 3.11.8 (main, Feb 26 2024, 21:39:34) [GCC 11.2.0]

PyYAML ==6.0.1
clearml == 1.15.1
docstring_parser ==0.16
importlib_resources ==6.4.0
jsonargparse ==4.27.7
lightning ==2.2.3
matplotlib
numpy ==1.26.4
pandas
pillow == 10.2.0
torch ==2.3.0+cu118
torchmetrics ==1.3.2
torchvision ==0.18.0+cu118
tqdm
typeshed_client ==2.5.1
typing_extensions ==4.11.0
zipp ==3.18.1

pip:

attrs==23.2.0
certifi==2024.2.2
charset-normalizer==3.3.2
clearml==1.15.1
Cython==3.0.10
distlib==0.3.8
filelock==3.13.4
furl==2.1.3
idna==3.7
jsonargparse==4.27.7
jsonschema==4.21.1
jsonschema-specifications==2023.12.1
orderedmultidict==1.0.1
pandas==2.2.2
pathlib2==2.3.7.post1
pillow==10.2.0
platformdirs==4.2.1
psutil==5.9.8
PyJWT==2.8.0
pyparsing==3.1.2
python-dateutil==2.8.2
pytz==2024.1
PyYAML==6.0.1
referencing==0.35.0
requests==2.31.0
rpds-py==0.18.0
six==1.16.0
torchvision==0.18.0+cu118
tzdata==2024.1
urllib3==1.26.18
virtualenv==20.26.0
zipp==3.18.1

The task fails because torch and lightning are not installed. Interestingly, torchvision is installed, so I have no clue what could be going wrong.
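To make the gap between the two lists explicit, here is a minimal sketch that diffs them using only the Python standard library. The helper names (`normalize`, `parse_names`, `missing_packages`) are hypothetical and not part of ClearML; the sketch compares package names only and ignores version pins.

```python
import re

# The two lists as pasted in this issue.
ORIGINAL_PIP = """
# Python 3.11.8 (main, Feb 26 2024, 21:39:34) [GCC 11.2.0]
PyYAML ==6.0.1
clearml == 1.15.1
docstring_parser ==0.16
importlib_resources ==6.4.0
jsonargparse ==4.27.7
lightning ==2.2.3
matplotlib
numpy ==1.26.4
pandas
pillow == 10.2.0
torch ==2.3.0+cu118
torchmetrics ==1.3.2
torchvision ==0.18.0+cu118
tqdm
typeshed_client ==2.5.1
typing_extensions ==4.11.0
zipp ==3.18.1
"""

INSTALLED_PIP = """
attrs==23.2.0
certifi==2024.2.2
charset-normalizer==3.3.2
clearml==1.15.1
Cython==3.0.10
distlib==0.3.8
filelock==3.13.4
furl==2.1.3
idna==3.7
jsonargparse==4.27.7
jsonschema==4.21.1
jsonschema-specifications==2023.12.1
orderedmultidict==1.0.1
pandas==2.2.2
pathlib2==2.3.7.post1
pillow==10.2.0
platformdirs==4.2.1
psutil==5.9.8
PyJWT==2.8.0
pyparsing==3.1.2
python-dateutil==2.8.2
pytz==2024.1
PyYAML==6.0.1
referencing==0.35.0
requests==2.31.0
rpds-py==0.18.0
six==1.16.0
torchvision==0.18.0+cu118
tzdata==2024.1
urllib3==1.26.18
virtualenv==20.26.0
zipp==3.18.1
"""


def normalize(name: str) -> str:
    """Normalize a package name: lowercase, runs of '-', '_', '.' become '-'."""
    return re.sub(r"[-_.]+", "-", name).lower()


def parse_names(text: str) -> set[str]:
    """Extract normalized package names, skipping blank and comment lines."""
    names = set()
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        # The name ends at the first whitespace or version operator.
        name = re.split(r"[\s=<>!~]", line, maxsplit=1)[0]
        names.add(normalize(name))
    return names


def missing_packages(requested: str, installed: str) -> list[str]:
    """Return requested package names that do not appear in the installed list."""
    return sorted(parse_names(requested) - parse_names(installed))


MISSING = missing_packages(ORIGINAL_PIP, INSTALLED_PIP)
print(MISSING)
```

For the lists above, this reports ten missing names, so the gap is wider than torch and lightning alone: numpy, matplotlib, torchmetrics and tqdm, among others, are also absent from the Docker environment.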

I have attached the complete log:

task_cab63ae315ba4a7690ea77248d1e9e48.log

Environment

Self-hosted
WebApp: 1.15.0-472 • Server: 1.15.0-472 • API: 2.29
Python Version 3.11
Linux Ubuntu 22

terbed, Apr 26 '24 12:04

Hi @terbed, was the original pip you attached created during the local run of the task, before running it remotely?

jkhenning, May 02 '24 15:05

Yes, that is true. More precisely, the remote run also stores the original pip packages, and that stored list is the one copied here. But the original pip was generated by a local run of the task.

terbed, May 03 '24 08:05