Support cml-runner installation on a system without internet connection

Open mnrozhkov opened this issue 4 years ago • 11 comments

For enterprise usage of CML, we often have to set up self-hosted cml-runners on a server without an internet connection, usually for security reasons. In such cases, it is not easy to set everything up. The current way to install cml-runner requires internet access to install npm dependencies, and it also automatically downloads the latest version of the GitHub/GitLab runners.

This issue requests a way to install and set up CML (cml-runners) on a system without an Internet connection.

Priority:

  • Red Hat Enterprise Linux 7
  • 64-bit AMD / 64-bit Intel

mnrozhkov avatar Sep 27 '21 12:09 mnrozhkov

@mnrozhkov I'm curious about your setup: do your self-hosted runners have access to an internal docker registry or an internal npm cache?

I think a little more information could help determine the best way to approach your problem.

A long time ago, it was common to self-host an npm cache to deal with instability. If you are able to set one up in a DMZ, that is probably the fastest solution.

dacbd avatar Sep 27 '21 16:09 dacbd

I presume the idea would be to install (npm install @dvcorg/cml) locally and then copy the node_modules folder to an on-prem machine which interacts with an on-prem gitlab server.

related to #615 and somewhat related to #607
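
A minimal sketch of that copy step, assuming the node_modules folder has already been populated on a machine with internet access. The function names `bundle_dir` and `unpack_bundle` are hypothetical helpers, not part of CML. Packages with native addons would presumably need to be installed on a machine with the same OS and architecture as the offline target, and this does nothing about the downloads that the cml runner subcommand performs at runtime.

```python
import tarfile
from pathlib import Path


def bundle_dir(src_dir: Path, archive: Path) -> Path:
    """Pack src_dir (e.g. a populated node_modules) into a gzip tarball."""
    with tarfile.open(archive, "w:gz") as tar:
        # arcname stores only the directory's own name, not its absolute path.
        tar.add(src_dir, arcname=src_dir.name)
    return archive


def unpack_bundle(archive: Path, dest_dir: Path) -> Path:
    """Unpack the tarball into dest_dir on the offline machine."""
    dest_dir.mkdir(parents=True, exist_ok=True)
    with tarfile.open(archive, "r:gz") as tar:
        # Only unpack archives you created yourself.
        tar.extractall(dest_dir)
    return dest_dir
```

The archive produced by `bundle_dir` is what would be moved to the on-prem machine by whatever transfer channel is approved there, then restored with `unpack_bundle`.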

casperdcl avatar Sep 28 '21 22:09 casperdcl

> I presume the idea would be to install locally and then copy the node_modules folder to an on-prem machine which interacts with an on-prem gitlab server.
>
> related to #615 and somewhat related to #607

To pass a complete bundle when cml-runner creates/registers itself?

dacbd avatar Sep 28 '21 22:09 dacbd

> To pass a complete bundle when cml-runner creates/registers itself?

I don't follow. I think this is only about installing CML on a machine which doesn't have internet access. At the moment this will work... except for the cml runner subcommand, which requires an internet connection to download dependencies at runtime.

casperdcl avatar Sep 28 '21 23:09 casperdcl

> I don't follow. I think this is only about installing CML on a machine which doesn't have internet access. At the moment this will work... except for the cml runner subcommand, which requires an internet connection to download dependencies at runtime.

That is what I interpreted this as since @mnrozhkov specifically mentions cml-runner

dacbd avatar Sep 28 '21 23:09 dacbd

@DanielBarnes let me give a brief explanation of the setup. Actually, it is not my working configuration but a common situation I see in many teams in large companies.

  1. There is a server_A without Internet access due to security reasons (it has access to our data in our internal storage)
  2. Everything we need to install must first be downloaded manually via another server, server_B (which has internet access), and then uploaded to server_A (including Docker and Python packages)
  3. There is an on-premise GitLab in our internal Network (so we can access it from server_A)
  4. We don't have npm installed on server_A (so, it also should be installed as described in step 2). Of course, npm is not able to download anything from the Internet at runtime.

The only option to use CML is to install GitLab Runners manually and set up CI with CML functions. Is there a way to simplify the installation of CML, or do we need to proceed with native GitLab Runners?

mnrozhkov avatar Sep 29 '21 05:09 mnrozhkov

I can't speak to that, but I believe supporting GitHub Enterprise is on the roadmap, if there isn't a way already, and I would assume that on-prem GitLab would be in the same vein. @casperdcl ?

As for alternative solutions, I would consider the feasibility of hosting an APT mirror and a private npm registry alongside server_B, which you can populate with vetted packages, or simply mirror all requested packages if that level of security isn't required. I might also consider using LXC: those containers are very VM-like, so you could install everything into one on server_B and then move it to server_A and run it there.

dacbd avatar Sep 29 '21 15:09 dacbd

Thanks, @DanielBarnes, for the alternative ideas! All of them should work. However, in such cases it seems much easier to work with GitLab Runners and pre-built Docker images with CML installed. :)

mnrozhkov avatar Sep 29 '21 16:09 mnrozhkov

Maybe we can ship the GitLab and GitHub self-hosted runner agents with the CML container images, so there is nothing to download/install.

Supersedes #615?

0x2b3bfa0 avatar Oct 01 '21 19:10 0x2b3bfa0

Shipping with Docker images would be great for this (#752).

#615 would include non-Docker cases so is different.

casperdcl avatar Oct 01 '21 23:10 casperdcl

Actually, we could solve this in one go by trying local storage first (file://) and then falling back to http:// if the file is not found.
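
A rough sketch of that fallback, written in Python for illustration only. It is not how CML fetches the runner agents, and `fetch_first_available` is a hypothetical name. It relies on the standard library's `urllib.request.urlopen`, which handles both file:// and http:// URLs.

```python
import urllib.error
import urllib.request
from typing import Iterable


def fetch_first_available(urls: Iterable[str], timeout: float = 10.0) -> bytes:
    """Return the body of the first URL that can be read, trying them in order."""
    last_error = None
    for url in urls:
        try:
            with urllib.request.urlopen(url, timeout=timeout) as response:
                return response.read()
        except (urllib.error.URLError, OSError) as err:
            # Missing local file or unreachable host: remember why, try the next one.
            last_error = err
    raise FileNotFoundError(f"no source could be read: {last_error}")
```

A caller would pass the local file:// location first and the remote http:// location second, so a machine without internet access never reaches the second entry as long as the local copy exists.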

DavidGOrtega avatar Oct 15 '21 11:10 DavidGOrtega

We will revisit this upon user request.

dacbd avatar Feb 17 '23 15:02 dacbd