clearml
remote execution with local python package
Hi,
demo project clearml demo server runs
I have a project with a local package involving source compilation that needs to be installed via pip.
The requirements.txt specifies the local package as:
./local_package
I'd like to run clearml-agent on such a project, but the package detection stage seems to report only the name of the local package and not its location, as it would for packages installed from a git repository.
So when I try to execute such a task with clearml-agent, the installation fails:
ERROR: Could not find a version that satisfies the requirement my_local_package==0.0.0 (from -r /tmp/cached-reqsj92bjyak.txt (line 4)) (from versions: none)
ERROR: No matching distribution found for my_local_package==0.0.0 (from -r /tmp/cached-reqsj92bjyak.txt (line 4))
clearml_agent: ERROR: Could not install task requirements!
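To illustrate the distinction at play here, below is a minimal sketch (not part of clearml; the helper name `split_requirements` and the prefix rules are assumptions made for illustration) that separates local-path entries in a requirements file from ordinary specifiers. Only the second kind can be resolved from a package index, which is why a pinned `my_local_package==0.0.0` fails.

```python
from typing import List, Tuple


def split_requirements(lines: List[str]) -> Tuple[List[str], List[str]]:
    """Separate local-path entries from ordinary requirement specifiers.

    Hypothetical helper: an entry is treated as local if it starts with
    './', '../', '/' or 'file:'. Blank lines and comments are skipped.
    """
    local, other = [], []
    for raw in lines:
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        if line.startswith(("./", "../", "/", "file:")):
            local.append(line)
        else:
            other.append(line)
    return local, other


if __name__ == "__main__":
    reqs = ["numpy==1.19.0", "./local_package", "# a comment", ""]
    local, other = split_requirements(reqs)
    print("local:", local)
    print("index:", other)
```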
I can still edit the installed packages in the web UI to add ./local_package
but this is cumbersome. Moreover, after execution, the installed packages field stores an absolute path for this local package, which can break replication on another worker.
A solution would be to move local_package on a git repository and install it through pip install git+...
but my team is not willing to do it for their own reasons.
What do you think ?
Hi @elinep
In your code, before calling Task.init
add the following line:
Task.add_requirements("./local_package")
If they do agree to create a git repo :) you would just add the git directly with:
Task.add_requirements("git+https://github.com/...")
Notice, please test with the latest RC, I remember there was a fix to improve support for local packages:
pip install clearml==0.17.5rc5
Thanks @bmartinn.
I guess I'll have to manage the task requirements manually as you suggest.
What about the fact that once executed the task requirements is modified with worker internal path for local packages:
# draft task
./my_local_package
# once executed by clearml-agent
my_local_package @ file:///.../.clearml/venvs-builds/3.8/task_repository/trains_gitsm_local_install.git/my_local_package
This task is now likely to fail (or run with a wrong version) if we try to run it on another worker as the path for my_local_package is probably not valid.
You have a very good point, this is definitely a bug.
I'll update here once the clearml-agent
fixes this issue (basically, when the agent updates the requirements back, it should put back the original local package link)
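As a rough sketch of the idea (this is not the agent's actual implementation; the function name `relativize_local_requirement` and the example paths are hypothetical), a `name @ file:///...` entry pointing inside the worker's checkout can be mapped back to a path relative to the repository root, so the stored requirement no longer depends on one worker's filesystem:

```python
from pathlib import PurePosixPath
from urllib.parse import unquote, urlparse


def relativize_local_requirement(line: str, repo_root: str) -> str:
    """Rewrite 'pkg @ file:///<repo_root>/sub/dir' as './sub/dir'.

    Lines that are not file:// direct references, or that point outside
    repo_root, are returned unchanged. POSIX-style paths are assumed.
    """
    if " @ " not in line:
        return line
    _name, _, url = line.partition(" @ ")
    parsed = urlparse(url.strip())
    if parsed.scheme != "file":
        return line
    abs_path = PurePosixPath(unquote(parsed.path))
    try:
        rel = abs_path.relative_to(PurePosixPath(repo_root))
    except ValueError:
        # The package lives outside the repository; leave it untouched.
        return line
    return "./" + rel.as_posix()


if __name__ == "__main__":
    # Hypothetical worker checkout location, for illustration only.
    root = "/w/task_repository/repo.git"
    entry = "my_local_package @ file:///w/task_repository/repo.git/my_local_package"
    print(relativize_local_requirement(entry, root))
```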
Hi @elinep A fix was pushed, if you feel like testing before the RC is out :)
pip3 install git+https://github.com/allegroai/clearml-agent.git
Just updating here that an RC is out :)
pip install clearml==0.17.6rc1
git+https://github.com/
I wish there was a way to link a specific remote git branch. Tried
git+https://github.com/...#branch
and the /tree/branch
link, both didn't work. Had to link to a local clone branch checkout
I wonder how clearml, in poetry
mode, resolves poetry
deps like:
river = { git = "https://github.com/ColdTeapot273K/river.git", branch = "feature/mini-batch-support" }
(Can't check since poetry
mode is not working for me rn, see #545)
Hi @ColdTeapot273K
I wish there was a way to link a specific remote git branch. Tried
There is, the following should work (note: please make sure pip >= 20
); in your "Installed Packages" you can add this line
git+https://github.com/allegroai/clearml.git@12fa7c92aaf8770d770c8ed05094e924b9099c16
Which would install the clearml
package directly from the git repository at the commit-id 12fa7c92aaf8770d770c8ed05094e924b9099c16
You can also replace the specific commit id with branch name, both will work :)
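For reference, a small sketch of how such a line is put together (the helper `git_requirement` is hypothetical and only builds the string; it does not install anything). The `@<ref>` suffix follows pip's VCS requirement syntax, where the ref can be a commit id, a branch, or a tag:

```python
def git_requirement(repo_url: str, ref: str = "") -> str:
    """Build a pip VCS requirement line of the form git+<url>[@<ref>].

    Hypothetical helper for illustration; 'ref' may be a commit id,
    a branch name, or a tag. An empty ref leaves the default branch.
    """
    url = repo_url if repo_url.startswith("git+") else "git+" + repo_url
    return f"{url}@{ref}" if ref else url


if __name__ == "__main__":
    # Pin to the commit id mentioned above.
    print(git_requirement(
        "https://github.com/allegroai/clearml.git",
        "12fa7c92aaf8770d770c8ed05094e924b9099c16",
    ))
    # Pin to a branch instead (repository and branch from the poetry example).
    print(git_requirement(
        "https://github.com/ColdTeapot273K/river.git",
        "feature/mini-batch-support",
    ))
```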
Hi @elinep In your code, before calling
Task.init
add the following line: Task.add_requirements("./local_package")
This is not correct, you can add a requirements.txt file this way, not a local python package file.
Is this still not resolved? I am also facing this issue
ERROR: Could not find a version that satisfies the requirement google_cloud_storage==2.9.0 (from -r /tmp/cached-reqs_xokg8xa.txt (line 2)) (from versions: 0.20.0, 0.21.0, 0.22.0, 0.23.0, 0.23.1, 1.0.0, 1.1.0, 1.1.1, 1.2.0, 1.3.0, 1.3.1, 1.3.2, 1.4.0, 1.5.0, 1.6.0, 1.7.0, 1.8.0, 1.9.0, 1.10.0, 1.11.0, 1.11.1, 1.12.0, 1.12.1, 1.13.0, 1.13.1, 1.13.2, 1.13.3, 1.14.0, 1.14.1, 1.15.0, 1.15.1, 1.15.2, 1.16.0, 1.16.1, 1.16.2, 1.17.0, 1.17.1, 1.18.0, 1.18.1, 1.19.0, 1.19.1, 1.20.0, 1.21.0, 1.22.0, 1.23.0, 1.24.0, 1.24.1, 1.25.0, 1.26.0, 1.27.0, 1.28.0, 1.28.1, 1.29.0, 1.30.0, 1.31.0, 1.31.1, 1.31.2, 1.32.0, 1.33.0, 1.34.0, 1.35.0, 1.35.1, 1.36.0, 1.36.1, 1.36.2, 1.37.0, 1.37.1, 1.38.0, 1.39.0, 1.40.0, 1.41.0, 1.41.1, 1.42.0, 1.42.1, 1.42.2, 1.42.3, 1.43.0, 1.44.0, 2.0.0)
ERROR: No matching distribution found for google_cloud_storage==2.9.0 (from -r /tmp/cached-reqs_xokg8xa.txt (line 2))
clearml_agent: ERROR: Could not install task requirements!
Command '['/root/.clearml/venvs-builds/3.6/bin/python', '-m', 'pip', '--disable-pip-version-check', 'install', '-r', '/tmp/cached-reqs_xokg8xa.txt']' returned non-zero exit status 1.
@hotshotdragon this issue is related to local packages, not to packages found in pypi. The error you see is because the agent is likely using Python 3.7, for which google_cloud_storage is only supported up to version 2.0.0
@jkhenning Thanks for the quick response, I found a temporary fix to the issue. But I am using 3.10 everywhere, not sure how the agent is using 3.7. Also I am not using google cloud storage anywhere, so why it is getting picked up, I am not sure about that as well.
Another thing, every time I do a local run and then clone that experiment to run from agent, it gives error of module not found. Issue is mentioned here https://github.com/allegroai/clearml/issues/503
It's possible this is the only python version that agent finds in the docker container it's using?
In console I found this "Python executable with version '3.10' requested by the Task, not found in path, using '/usr/bin/python3' (v3.6.9) instead"
Maybe this is causing the issue. I am not sure how I can change the Python version.
Hi @hotshotdragon, this simply means python version 3.10 was not installed on the docker image used to run the code - if you use an image with 3.10 installed, the agent should be able to find it and use it as required
I solved it. I was missing the task.set_base_docker method call.
Wow, he hijacks the issue with something off topic and then you close it @jkhenning
Apologies @JeremyMahieu ! 🙏 🙏 I honestly lost track 🙁 - reopening, of course
For anyone still wondering, I ran into the same issue with my local module and was able to force the agent to correctly build it on a different machine with the following:
Task.ignore_requirements("my_module")
Task._force_requirements[
"my_module @ file:///${PROJECT_ROOT}/../../my_module"
] = None
However, as I mentioned in https://github.com/allegroai/clearml-agent/issues/191, this feels quite hacky.