Support for private Docker containers

Open glatard opened this issue 8 years ago • 23 comments

Currently, only Docker containers that are publicly accessible on DockerHub are supported. To support private ones, CBRAIN should be able to authenticate to DockerHub using its own credentials, which should be defined in the Portal, Bourreau or Tool Config. Users could then grant CBRAIN access to their containers without making them public.

Currently it seems that the only way to log in to DockerHub is with a login and password. I guess these will have to be stored in clear text and travel with the task.

This feature should be implemented as follows:

  • store the username and password of a "cbrain" docker account somewhere in the database (I don't think it should be specific to any tool or Bourreau).
  • before running the task, i.e. just before doing docker pull and docker run, add a docker login -u xxx -p yyy command.

glatard avatar Sep 29 '16 22:09 glatard
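
A minimal sketch of the command sequence proposed in the comment above, not code from the CBRAIN repository. The function names `docker_command_plan` and `run_plan` and the image name in the usage note are hypothetical. The sketch swaps the proposed `-p yyy` for `docker login --password-stdin`, which reads the password from standard input so that it does not appear in the argument list.

```python
import subprocess
from typing import List, Optional


def docker_command_plan(image: str, username: Optional[str] = None) -> List[List[str]]:
    """Return the docker commands for one task, in the order they should run."""
    plan: List[List[str]] = []
    if username is not None:
        # Assumption: a single shared account is used for all private images.
        plan.append(["docker", "login", "-u", username, "--password-stdin"])
    plan.append(["docker", "pull", image])
    plan.append(["docker", "run", "--rm", image])
    return plan


def run_plan(plan: List[List[str]], password: Optional[str] = None) -> None:
    """Execute the plan; the password is only fed to the login step, via stdin."""
    for argv in plan:
        stdin_data = password if argv[1] == "login" else None
        subprocess.run(argv, input=stdin_data, text=True, check=True)
```

For example, `docker_command_plan("example/private-tool:1.0", "cbrain")` yields a login, a pull and a run command, in that order; without a username the login step is simply omitted.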

Private container support is a requirement for an upcoming tool.

merous avatar Oct 20 '16 14:10 merous

Well I'm working on #133 right now so I'll have to look at this next week.

prioux avatar Oct 20 '16 16:10 prioux

If there is no other method than "docker login" with a password on the command-line, I'm not even sure I want such a feature in CBRAIN. It is a terrible security architecture.

prioux avatar Oct 22 '16 17:10 prioux

I think until there is a better mechanism on the docker side, CBRAIN should only allow public docker images. If your image is not public, too bad...

prioux avatar Oct 22 '16 17:10 prioux

It's bad security on the side of Dockerhub, does it present a risk for CBRAIN? As long as we trust the container itself, is there any harm?

merous avatar Oct 22 '16 18:10 merous

The problem is that CBRAIN is entrusted with the dockerhub credentials of an ordinary CBRAIN user. The username and password of that user on dockerhub would be kept in plain text within CBRAIN. That's really not good. If CBRAIN gets compromised, the hackers get those. And they will have to travel back and forth between CBRAIN and the clusters too: if the cluster gets compromised, they get the password, again.

prioux avatar Oct 22 '16 18:10 prioux

Understood. Still, the Dockerhub user already accepts this situation, or should be made aware of the risks. What are the other options? Build and keep the Docker images locally on a CBRAIN data provider?

merous avatar Oct 22 '16 18:10 merous

Is it not possible to keep a version of the container directly on the cluster, like having a tool installed on the cluster? If I remember correctly, there was a discussion about this, because it would avoid pulling the image every time we launch a task.

natacha-beck avatar Oct 23 '16 03:10 natacha-beck

CBRAIN won't store the credentials of any user, just its own. Users who want to use a private Docker image in CBRAIN would have to grant access to user 'cbrain' on Dockerhub.

I agree that storing the CBRAIN password in clear text is not ideal, but we also have to be realistic here. These containers won't be worth a million dollars, and if they are, they shouldn't be on DockerHub anyway. Also, if CBRAIN gets hacked, then the hackers could access the (passwordless) private key, which is in my opinion much more annoying than a DockerHub password.

If we want to go for a stronger security architecture, then let's encrypt the password with CBRAIN's public key, keep it encrypted in the DB, decrypt it on the worker node just before docker login is done and destroy it just after. We should also destroy ~/.docker and ~/.dockercfg when docker pull has been done. Then the password will remain stored encrypted in the DB and hackers couldn't read it unless they have access to the private key. If CBRAIN gets compromised, we just have to change the DockerHub password and that's it. Sounds good?

glatard avatar Oct 24 '16 16:10 glatard

Okay, that makes things clearer. Now the thing is, cbrain doesn't have a user on dockerhub. Maybe we should create one? (I assume that's what you meant by "grant access to user 'cbrain' on Dockerhub"?)

prioux avatar Oct 24 '16 16:10 prioux

Yes, we should create one and give it read access to the mcin organization.

If docker login fails we should still try to download the container with docker pull (as already done) in case the container is public. I don't think there's a way to check programmatically if a container is private.

glatard avatar Oct 24 '16 16:10 glatard

Hi folks, is anyone working on that at the moment? It's required for an application that needs to be ported.

glatard avatar Dec 08 '16 20:12 glatard

I haven't worked on this, no, but it's feasible for sure. I'm just wondering where we put the docker username and password... in the ToolConfig object? In the Tool object? Do we want to support getting images that can come from different sources depending on the tool config version? I think the tool config is the ideal place.

I also think we have to make sure the .docker and .dockercfg are NOT created in $HOME; I hope there's a way to override this. If the feature is used by multiple tasks simultaneously, they can't be allowed to interfere with each other.

prioux avatar Dec 08 '16 21:12 prioux
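
One possible way to address the concern in the comment above is the `DOCKER_CONFIG` environment variable, which tells the docker client where to keep its configuration directory instead of `$HOME/.docker`. The sketch below, with the hypothetical helper name `isolated_docker_env`, gives each task its own temporary directory and deletes it afterwards. It assumes the client stores login data in that directory rather than in an external credential helper.

```python
import os
import shutil
import tempfile
from contextlib import contextmanager
from typing import Dict, Iterator, Optional


@contextmanager
def isolated_docker_env(parent_dir: Optional[str] = None) -> Iterator[Dict[str, str]]:
    """Yield an environment whose DOCKER_CONFIG points at a private temp dir."""
    # mkdtemp creates a uniquely named directory readable only by its owner,
    # so simultaneous tasks never share a config directory.
    config_dir = tempfile.mkdtemp(prefix="docker-config-", dir=parent_dir)
    env = dict(os.environ, DOCKER_CONFIG=config_dir)
    try:
        yield env
    finally:
        # Remove any stored login data once the task is done with docker.
        shutil.rmtree(config_dir, ignore_errors=True)
```

A task would then pass the yielded mapping to its docker invocations, for instance `subprocess.run(["docker", "pull", image], env=env)` inside the `with isolated_docker_env() as env:` block.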

I would say the docker credentials should be properties of BrainPortal. In this way, if people want to declare their private container in CBRAIN they just have to give access to a single account, always the same one. It makes things easier than having to handle an account per tool or toolconfig. It's also less configuration in the toolconfig. In the future we could always extend it if required.

glatard avatar Dec 08 '16 21:12 glatard

Hum, I see. Yes, of course; I was thinking of the reverse situation again, but we only need one account. Ok, seems feasible. It will be ready in June or July.

prioux avatar Dec 08 '16 22:12 prioux

(Joking)

prioux avatar Dec 08 '16 22:12 prioux

Also, I think we should try for a public image first, and then if it fails log in and try to see if it's accessible.

prioux avatar Dec 08 '16 22:12 prioux
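
The order suggested in the comment above (anonymous pull first, login only as a fallback) could look roughly like this. It is a sketch under stated assumptions: `pull_image` and the `run` callback are hypothetical names, and `run` stands in for whatever executes a command on the worker node, supplies the password to the login step, and reports success as a boolean.

```python
from typing import Callable, List

# A runner executes one command and returns True if it succeeded.
Runner = Callable[[List[str]], bool]


def pull_image(image: str, username: str, run: Runner) -> str:
    """Try an anonymous pull, then fall back to an authenticated one."""
    pull = ["docker", "pull", image]
    if run(pull):
        return "public"  # no credentials were needed
    login = ["docker", "login", "-u", username, "--password-stdin"]
    # Only log in after the anonymous attempt has failed.
    if run(login) and run(pull):
        return "private"
    return "failed"
```

Passing the runner as a parameter keeps the decision logic separate from process execution, so the fallback order can be exercised with a fake runner and without a docker daemon.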

We need a singleton object in CBRAIN that represents the entire CBRAIN installation; I have used in the past the first BrainPortal object, but that's not necessarily the best place, since a CBRAIN installation can have multiple distinct portals.

prioux avatar Dec 14 '16 21:12 prioux

It would be a great place to store global information about the entire system, including the docker hub credentials.

prioux avatar Dec 14 '16 21:12 prioux

I see I have to re-engineer many things about the docker integration in cluster_task.rb; the code that's there is error-prone. For instance, re-invoking cluster_commands from within docker_commands, when it was already invoked from within submit_cluster_job ... if cluster_commands has side effects or is not repeatable within the same run number, this will crash.

prioux avatar Dec 14 '16 21:12 prioux

@prioux indicates that the work in the last comment has been completed. What is left to do:

  1. Implement public first, then pull with credentials.
  2. Explore if Dockerhub provides a way to authenticate with a key.
  3. Create a Dockerhub CBRAIN identity and implement its use for pulling private images.
  4. Provide a procedure for users to grant CBRAIN access on Dockerhub.
  5. Deploy in all instances of CBRAIN.

shots47s avatar Dec 19 '17 18:12 shots47s

Since docker is not really supported on the supercomputers, I'm lowering this to low priority.

In the meantime, we can use the procedure described in #295 and allow a user to register a private docker image directly in CBRAIN.

prioux avatar Feb 19 '18 22:02 prioux

Should we close this one?

natacha-beck avatar Aug 25 '21 14:08 natacha-beck