jupyterhub-on-hadoop

Manual installation in air-gapped environments

Open hussainsultan opened this issue 5 years ago • 5 comments

The documentation assumes that the cluster can access the public internet. This may not be the case in practice. I am not sure whether air-gapped installation is in scope for this project, but I thought I'd flag it here.

hussainsultan avatar Apr 23 '19 01:04 hussainsultan

How do people normally handle this? Searching Cloudera's documentation, I also couldn't find anything about air-gapped installs.

jcrist avatar Apr 23 '19 16:04 jcrist

The Cloudera Data Science Workbench documentation covers air-gapped installation here: https://www.cloudera.com/documentation/data-science-workbench/latest/topics/cdsw_install.html

For CSD-based installs in an air-gapped environment, put the Cloudera Data Science Workbench parcel into a new hosted or local parcel repository, and then configure the Cloudera Manager Server to target this newly created repository.

Could this be done by targeting a local conda repository with required packages?
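To make the local-repository idea concrete, here is a minimal sketch, not something this project ships. It renders a `.condarc` that points conda at a single local channel. The function name `local_channel_condarc` and the path `/opt/conda-channel` are hypothetical, and the directory is assumed to already hold the required packages and to have been indexed (for example with `conda index`).

```python
from pathlib import Path


def local_channel_condarc(channel_dir):
    """Render .condarc text that restricts conda to one local channel.

    `channel_dir` must be an absolute path; Path.as_uri() raises
    ValueError for relative paths.
    """
    channel_url = Path(channel_dir).as_uri()
    return (
        "channels:\n"
        f"  - {channel_url}\n"
        # `offline: true` keeps conda from contacting remote servers;
        # cached content and file:// channels remain usable.
        "offline: true\n"
    )


# Hypothetical usage on a cluster node (path is an assumption):
# Path.home().joinpath(".condarc").write_text(
#     local_channel_condarc("/opt/conda-channel"))
```

Writing the result to `~/.condarc` on each node would make `conda install` resolve only against that directory, assuming it is mounted or copied to every node.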

hussainsultan avatar Apr 25 '19 14:04 hussainsultan

It could. Or we could build RPMs (#8), or use conda-pack to package the environment for transport. There are lots of things that could work; I'm just not sure what's best.

jcrist avatar Apr 25 '19 14:04 jcrist

@hussainsultan would creating a parcel help solve the software distribution problem?

If that is the case, the easiest way I can find to create one is by using conda-pack. Let me hack something really quick and post it back here.

sodre avatar Apr 26 '19 02:04 sodre

@sodre creating a parcel will solve this issue for Cloudera-managed Hadoop clusters, but I am not sure that's the most general answer, as @jcrist mentioned. Perhaps the best answer is just to document one way of doing an offline install, e.g. using conda-pack to create a tarball and pushing it to an edge node.

Apologies for the delay.
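As a rough illustration of the conda-pack route, here is a sketch under stated assumptions, not documented project procedure. It builds the `conda pack` command to run on a machine with internet access, then extracts the resulting tarball on the offline edge node. The function names, the environment name `jupyterhub`, and the target path are hypothetical. The sketch assumes conda-pack is installed on the build machine and that the archive contains the `bin/conda-unpack` script conda-pack normally includes.

```python
import subprocess
import tarfile
from pathlib import Path


def pack_command(env_name, output):
    """Build the conda-pack CLI call that archives an existing environment."""
    return ["conda", "pack", "-n", env_name, "-o", output]


def unpack_environment(tarball, target_dir, fix_prefixes=True):
    """Extract a conda-pack tarball on the offline node.

    If the archive ships bin/conda-unpack and fix_prefixes is true, run it
    so hard-coded prefixes are rewritten for the new location.
    """
    target = Path(target_dir)
    target.mkdir(parents=True, exist_ok=True)
    with tarfile.open(tarball) as archive:
        archive.extractall(target)
    unpack_script = target / "bin" / "conda-unpack"
    if fix_prefixes and unpack_script.exists():
        subprocess.run([str(unpack_script)], check=True)
    return target


# On the connected build machine (environment name is an assumption):
#   subprocess.run(pack_command("jupyterhub", "jupyterhub.tar.gz"), check=True)
# After copying the tarball to the edge node:
#   unpack_environment("jupyterhub.tar.gz", "/opt/jupyterhub")
```

The build machine and the edge node need matching operating systems and architectures, since the packed environment contains compiled binaries.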

hussainsultan avatar May 07 '19 19:05 hussainsultan