clearml-agent

Feature request: change ssh user on agent

Open milongo opened this issue 4 years ago • 5 comments

I have a git repository whose SSH credentials do not begin with the usual ssh://git@.... Instead, they begin with ssh://root@git..... It seems that the agent is hardcoded to use git@.... when cloning repositories over SSH.

More specifically, doing git clone ssh://git@<host>/repo/repo.git results in permission denied, whereas doing git clone ssh://root@<host>/repo/repo.git works just fine.

Can you please add a way to modify the SSH user for the agent?

Thanks.

milongo avatar Jan 08 '21 13:01 milongo

Thanks @milongo, you are correct: there is currently no way to configure the git user that the SSH clone uses (other than changing the original repository link in the Task, but that scales badly).

I think we should add a configuration argument for the agent section in the ~/clearml.conf file

  1. Since this issue is actually a feature request for clearml-agent, I'll make sure we move it to the correct repository
  2. I'll update once we have an RC with a fix

bmartinn avatar Jan 08 '21 23:01 bmartinn

Hi @milongo, good news: the new RC includes this feature. Add the following line to your ~/clearml.conf:

agent.force_git_ssh_user = "git"

And upgrade the clearml-agent to the latest RC:

pip install clearml-agent==0.17.2rc2

bmartinn avatar Feb 15 '21 15:02 bmartinn

@bmartinn should this work agent-side in clearml-agent==1.0.0?

I have:

$ pip freeze | grep clearml
clearml==1.0.4
clearml-agent==1.0.0

and in clearml.conf:

...
agent: {
    force_git_ssh_user: "git"
    force_git_ssh_protocol: true
    default_docker {
        image: "my_image:latest"
    }
...
}

Now after running Task.init locally, I see in the UI this URL pattern:

ssh://my_server_url:<ssh_port>/<user>/<repo>.git

and the execution fails with:

...
cloning: ssh://my_server_url:<ssh_port>/<user>/<repo>.git
fatal: Could not read from remote repository.
Please make sure you have the correct access rights
and the repository exists.
...

Should this work? After adding force_git_ssh_user: "git", should I instead see cloning: ssh://git@my_server_url:<ssh_port>/<user>/<repo>.git, or is that handled somewhere in the backend?

EDIT/Sidenote: When updating the repo URL with the user in the UI, it works. So I do not think it is a general permission/misconfiguration issue.
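To make the expected rewrite concrete, here is a minimal Python sketch of what "adding the user to a userless ssh:// URL" could look like. The helper name with_ssh_user and its force parameter are hypothetical and only illustrate the desired behavior; this is not how clearml-agent implements force_git_ssh_user.

```python
from urllib.parse import urlsplit, urlunsplit


def with_ssh_user(url: str, user: str, force: bool = False) -> str:
    """Return `url` with `user` set as the SSH login name.

    Hypothetical helper for illustration only; not part of clearml-agent.
    """
    parts = urlsplit(url)
    if parts.scheme != "ssh":
        # Leave https:// and scp-style (git@host:repo) links untouched.
        return url
    # Split "user@host:port" into its user part and host:port part.
    userinfo, _, hostport = parts.netloc.rpartition("@")
    if userinfo and not force:
        # A user is already present and we were not asked to override it.
        return url
    return urlunsplit(parts._replace(netloc=f"{user}@{hostport}"))


print(with_ssh_user("ssh://git.mycompany.com:2022/myuser/repo.git", "git"))
```

With force=True the same helper would also replace an existing user, which is the behavior the original request (root instead of git) would need.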

sgasse avatar Jul 26 '21 15:07 sgasse

Sorry if I misunderstand, I am still new to ClearML :slightly_smiling_face:. But does the commit you referenced, @bmartinn, really address the issue that @milongo mentioned?

If I see correctly, the referenced commit rewrites how https git repo urls are translated into ssh repo urls. But the issue raised by you @milongo is about 'userless' ssh clone links, right?

At least we have the issue that running ClearML in our git repo creates links like:

ssh://my_server_url:<ssh_port>/<user>/<repo>.git

and we do not want to fix them all by hand. Alternative solutions could be changing what repo url is written when a Task is initialized, but solving this 'agent-side' would be better.

sgasse avatar Jul 26 '21 15:07 sgasse

OK I found two workarounds, both rather hacky. If you need a solution, you can use it, but I personally would prefer a cleaner way.

Both rely on having a ~/.ssh/config file like the one below:

Host git.mycompany.com
  HostName 123.123.123.123
  IdentityFile ~/.ssh/id_rsa
  IdentitiesOnly yes
  User git
  Port 2022

One gotcha is that the Host entry must match the domain exactly as it appears in the ssh url. People often use abbreviations like Host gitserver and then state the domain as HostName. In this case, however, you need the full domain from the url as Host and the IP of the server as HostName (a DNS name might also work there, I have not tested it).

You can test it outside of the docker container by running e.g. git clone ssh://git.mycompany.com:2022/myuser/repo.git. With the config above in place, this should succeed even though the url contains no user.
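If the clone fails, it can help to check what ssh actually resolves for that host. The sketch below assumes an OpenSSH client that supports the -G option (which prints the effective configuration for a host without connecting); the function names are made up for this example.

```python
import subprocess


def parse_ssh_g(output: str) -> dict:
    """Parse `ssh -G` output (one "keyword value" pair per line) into a dict."""
    settings = {}
    for line in output.splitlines():
        key, _, value = line.strip().partition(" ")
        if key:
            # Some keywords (e.g. identityfile) repeat; keep the first one.
            settings.setdefault(key, value)
    return settings


def resolved_ssh_settings(host: str) -> dict:
    """Ask the local ssh client which settings it would use for `host`."""
    result = subprocess.run(
        ["ssh", "-G", host], capture_output=True, text=True, check=True
    )
    return parse_ssh_g(result.stdout)
```

Calling resolved_ssh_settings("git.mycompany.com") and looking at the "user", "hostname" and "port" entries should show whether the Host block above was picked up.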

For making this available in the docker container, we have two options.

Option 1: Own config/key with root, mount it manually

The docker containers executed by the ClearML agent run as root. Without any changes, ssh will complain about a bad owner of the config file, even when file permissions are 600. To circumvent this, you can make root the owner of the config and key. Outside of the docker container:

sudo chown root:root ~/.ssh/config && sudo chown root:root ~/.ssh/id_rsa

Now, however, you will get an error (visible when running the ClearML agent with --foreground) because your user can no longer copy the whole .ssh folder:

Failed creating temporary copy of ~/.ssh for git credential

So in your clearml.conf, you need to mount the credentials manually:

agent: {
    ...
    extra_docker_arguments: ["-v", "/home/datapipelineuser/.ssh:/root/.ssh"]
    ...
}
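To confirm that ownership is really what ssh objects to, a small check can be run inside the container. This is a sketch based on my understanding of the OpenSSH rule (the config must be owned by root or by the current user, and must not be writable by group or others); the function name is invented for this example.

```python
import os
import stat
from typing import List, Optional


def ssh_config_problems(path: str, uid: Optional[int] = None) -> List[str]:
    """List reasons ssh would likely reject `path` as a config file.

    Assumption: mirrors the "Bad owner or permissions" check as I understand
    it; `uid` defaults to the user running this script.
    """
    uid = os.getuid() if uid is None else uid
    st = os.stat(path)
    problems = []
    if st.st_uid not in (0, uid):
        problems.append(f"owned by uid {st.st_uid}, expected 0 or {uid}")
    if st.st_mode & (stat.S_IWGRP | stat.S_IWOTH):
        problems.append("writable by group or others")
    return problems
```

Inside the container, calling ssh_config_problems("/root/.ssh/config") should return an empty list once ownership is fixed.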

Option 2: Own the ssh credentials in the container

Instead of owning the credentials outside and mounting them manually, you can add a command to own the credentials in the container. Adding

agent: {
    ...
    extra_docker_shell_script: ["chown -R root:root /root/.ssh"]
    ...
}

to your clearml.conf should do the trick. I prefer the second solution, but overall, the best would be if we could either set a user when creating the task or override the user even when an ssh:// url is given.

sgasse avatar Jul 27 '21 08:07 sgasse

@milongo can we close this?

jkhenning avatar Mar 15 '23 13:03 jkhenning

Hi @jkhenning, we can! I'm closing it now.

milongo avatar Mar 15 '23 13:03 milongo