Bug: Hyperparameter optimisation does not clone docker settings from template task
I followed the example for Hyperparameter optimisation in the documentation. While some import statements are missing, I managed to set up a running script and optimise a template task. Pretty cool! :) However, it seems that the execution settings of that base task are not fully copied, in particular the docker settings.
Expected behaviour: Copy docker settings from template task to optimisation task
Actual behaviour: I see ubuntu:18.04 image and no docker arguments in the execution page.
Thanks for reporting, @MoPl90. We'll take a look at what seems to be going wrong.
Hi @MoPl90,
First, as for the docs: they are not supposed to contain the full code. They link to the full code (which should work out of the box here). Hope this is clear!
As for the issue, I added this to my base task:
task.set_base_docker(docker_image="ubuntu:18.04",docker_arguments='-e ENV=1',docker_setup_bash_script=['apt update'])
This sets the docker image, arguments and bash script (just dummy data).
Then, on the cloned experiments, i.e. the experiments that the HPO process spawns (not the HPO controller, which is the task called Automatic Hyper-Parameter Optimization), you do see this information, see below:
So I think it should work.
If this doesn't work for you, can you please let me know what SDK and server version you're using?
Hi,
Thanks for your reply. If I set the image and options explicitly via Task.set_base_docker, it works.
However, if I don't specify this explicitly, the agent uses neither the template task's docker settings nor the clearml.conf default settings.
@MoPl90 What versions of clearml python package and clearml server are you using? Also, how are you specifying the docker image in the template task? BTW, is the clearml agent running in docker mode? Maybe a silly question but worth asking :smile:
We are running version 1.6.2 of the python package, and the agents are version 1.3.0 running in docker mode. The server is on versions: 1.1.1-135 • 1.1.1 • 2.14.
I have a docker image specified in the clearml.conf file (which is ignored), and I manually added the image to the template task in the UI (which is also ignored). The correct image is only used if I use Task.set_base_docker in the HPO script.
@MoPl90, in the conf file, are you using agent.default_docker? If so, it won't register unless you're running the experiment using clearml agent. This is why it's not registered. As for adding the docker image to the template task from the UI, how are you doing it? Once a task is "completed" you can't add a docker image to the task.
In the conf file, are you using the agent.default_docker?
Yes, exactly. And it seems that this field is ignored by the agents, since the "image" field in the Execution panel of the UI is empty unless I specify the container via Task.set_base_docker.
it won't register unless you're running the experiment using clearml agent
Not sure I understand. If I have a base experiment using a custom container (say I specified it via Task.set_base_docker), I would expect the HPO experiments to copy those settings (as is the case for all other execution arguments). At the moment I have to specify the container manually by calling Task.set_base_docker in the HPO experiment again.