
Unable to set daemon with `--cpu-only`

Open hadyan-tvlk opened this issue 4 years ago • 3 comments

Dear ClearML Community,

I'm trying to run a ClearML agent on my GPU instance as a worker that accepts CPU-bound tasks. The command is as follows:

clearml-agent daemon --queue cpu --docker --force-current-version &

However, it immediately returns the following error:

clearml_agent: ERROR: Connection Error: it seems *api_server* is misconfigured. Is this the ClearML API server http://HOST_IP:8082 ?

Interestingly, when I set up the daemon with --gpus, using the following command:

clearml-agent daemon --queue default --force-current-version --gpus 3 &

It can run successfully.

Worker "gpuserver:gpu3" - Listening to queues:
+----------------------------------+---------+-------+
| id                               | name    | tags  |
+----------------------------------+---------+-------+
| 6772bc62a433f80f56bee080f4       | default |       |
+----------------------------------+---------+-------+

I'm not sure why running in CPU mode does not work. Am I missing something? Thanks in advance!

P.S.: I changed the exposed port to 8082, instead of 8008.

Installed Versions

  • clearml==1.0.1
  • clearml-agent==1.0.0

hadyan-tvlk, May 11 '21 09:05

Hi @hadyan-tvlk, if I understand correctly, the title should be "Unable to set daemon with --force-current-version", right? :) I could not reproduce this error; can you still reproduce it? What are the agent's host machine OS and Python version?

> P.S: I changed the exposed port to 8082, instead 8008

I'm assuming this is reflected in your clearml.conf? Can I assume the web / agent are working correctly?

bmartinn, May 12 '21 19:05

Hi @bmartinn,

Sorry for the late response.

No, the error is actually related to the --cpu-only configuration, not --force-current-version.

> If I understand correctly the title should be "Unable to set daemon with --force-current-version", right? :)

What do you mean by this?

> I'm assuming this is reflected in your clearml.conf?

Yes, the web server and file server are accessible.

> Can I assume the web / agent are working correctly?

I can still reproduce the problem. It works with the --gpus argument, but not with --cpu-only.

The error is as follows:

clearml_agent: ERROR: Connection Error: it seems *api_server* is misconfigured. Is this the ClearML API server http://HOST_IP:8082 ?

> I could not reproduce this error, can you still reproduce it? What is the agent's host machine OS & Python version?

The agent's host machine OS is Linux, and the Python version is 3.7.10.

> What is the agent's host machine OS & Python version?

hadyan-tvlk, May 18 '21 10:05

> I'm assuming this is reflected in your clearml.conf?

I just wanted to make sure that in your clearml.conf the api_server is configured with port 8082 and not the default 8008. For example:

        # CLEARML-AGENT configuration file
        api {
            api_server: http://HOST:8082
            web_server: http://HOST:8080
            files_server: http://HOST:8081 
        }
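
As a rough way to confirm which port a clearml.conf actually points at, the sketch below pulls the first `api_server` entry out of the file text and reports its port. The helper name `api_server_port` is hypothetical and not part of ClearML. The regex is only a stand-in for a real HOCON parser, so it assumes the simple one-entry-per-line layout shown above.

```python
import re
from urllib.parse import urlsplit

# Hypothetical helper, not part of ClearML. A regex stands in for a real
# HOCON parser, so this only handles simple `api_server: URL` lines.
API_SERVER_RE = re.compile(r'^\s*api_server\s*[:=]\s*"?([^"\s#]+)"?', re.MULTILINE)


def api_server_port(conf_text):
    """Return (url, port) for the first api_server entry, or None if absent."""
    match = API_SERVER_RE.search(conf_text)
    if match is None:
        return None
    url = match.group(1)
    # urlsplit(...).port is None when the URL carries no explicit port.
    return url, urlsplit(url).port
```

Calling `api_server_port(open(path).read())` on the agent's configuration file, where `path` is wherever your clearml.conf lives, should show 8082 if the port change was picked up.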

The error points to a failure to connect to the API server. Can you verify that you can run:

curl http://<HOST_HERE>:8082

What's the result ?
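
If curl is not available on the agent's host, roughly the same check can be sketched in Python with only the standard library. The function name `probe_api_server` is made up for this illustration. It only tells you whether something answers HTTP at that address, not whether it is a correctly configured ClearML API server.

```python
import http.client
import urllib.error
import urllib.request


def probe_api_server(url, timeout=5.0):
    """Return (reachable, detail) for a plain HTTP GET against url."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return True, f"HTTP {response.status}"
    except urllib.error.HTTPError as err:
        # The server answered, just not with a 2xx status.
        return True, f"HTTP {err.code}"
    except (OSError, http.client.HTTPException) as err:
        # Refused connection, DNS failure, timeout, or a non-HTTP service.
        return False, f"{type(err).__name__}: {err}"
```

For example, `probe_api_server("http://HOST_IP:8082")`, with HOST_IP replaced by the real address, returns `(False, ...)` when nothing is listening on that port from the agent's point of view.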

bmartinn, May 20 '21 00:05