
Importing custom python packages when launching kernels on Spark

Open amangarg96 opened this issue 6 years ago • 4 comments

I am using Jupyter Enterprise Gateway to run IPython kernels in YARN Cluster Mode on Apache Spark. The JupyterLab server runs on my local machine (a MacBook), the Jupyter Enterprise Gateway server runs on one of the cluster nodes, and the kernels are launched on the cluster.

Is there a way to import custom Python packages that were created on the notebook server machine? For instance, a user may be working on a full project that contains some Python packages they wrote. How would they import them?

amangarg96 avatar Nov 13 '18 09:11 amangarg96

At the moment, when running in YARN Cluster Mode, a package needs to be pip installed from a notebook cell or be available on all worker nodes. There are a few approaches to work around this issue: directory mounts such as NFS or an object store, Anaconda Enterprise, or regular update scripts that sync locally installed environments to remote workers.
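As a rough illustration of the "pip install from a notebook cell" option, the sketch below builds the pip command for the interpreter that is running the kernel, so the package lands in that kernel's environment. The helper names and the package name `my-custom-package` are hypothetical, and this only affects the process that runs the cell; whether other worker nodes see the package depends on how the cluster is set up.

```python
import subprocess
import sys


def pip_install_command(requirement):
    """Build a pip command bound to the interpreter running this kernel."""
    # Using sys.executable avoids installing into a different Python
    # than the one the kernel is actually using.
    return [sys.executable, "-m", "pip", "install", requirement]


def pip_install(requirement):
    """Run pip for the given requirement; raises CalledProcessError on failure."""
    subprocess.check_call(pip_install_command(requirement))


# Hypothetical usage from a notebook cell:
# pip_install("my-custom-package")
```

In a notebook, `!pip install my-custom-package` is the shorter form, but it relies on `pip` on the PATH matching the kernel's interpreter.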

lresende avatar Nov 13 '18 18:11 lresende

pip install from a notebook cell seems to be the most suitable solution for my use case. It should work if the Python package is stored in HDFS (so that it's available to all worker nodes). This would only be a temporary solution, though, as notebook users should be kept away from interacting with HDFS directly.

I'm thinking of going with a ContentsManager implementation, like HDFSContents or S3Contents, so that the full project (along with its Python packages) would be available on all worker nodes and the user can conveniently do a pip install.

Are there any suggestions on this?

amangarg96 avatar Nov 14 '18 07:11 amangarg96

@amangarg96 - Any updates on this?

kevin-bates avatar Jan 05 '19 18:01 kevin-bates

@kevin-bates I have not found the time to work out a proper solution for this, but one hacky way to do it from notebooks is to use IPython's built-in magic commands.

The user can send the custom package to the IPython kernel's container with %%writefile: paste the contents of a file (from the local file system) into a notebook cell and put the %%writefile magic command at the top of the cell to create that file on the container.

Then use system calls (!) to build the packages.
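A minimal sketch of the same idea in plain Python, for illustration only: it writes pasted source into a package directory on the kernel side, much as a cell starting with `%%writefile my_project_pkg/helpers.py` would, and then makes it importable. Instead of building the package with a `!` system call, this sketch simply puts the directory on `sys.path`. The function name `write_module` and the names `my_project_pkg` and `helpers` are hypothetical.

```python
import importlib
import sys
import tempfile
from pathlib import Path


def write_module(base_dir, package_name, module_name, source):
    """Write `source` to <base_dir>/<package_name>/<module_name>.py and import it."""
    package_path = Path(base_dir) / package_name
    package_path.mkdir(parents=True, exist_ok=True)
    # An __init__.py makes the directory a regular package.
    (package_path / "__init__.py").touch()
    (package_path / f"{module_name}.py").write_text(source)

    # Make the parent directory importable in the running kernel.
    if str(base_dir) not in sys.path:
        sys.path.insert(0, str(base_dir))
    importlib.invalidate_caches()
    return importlib.import_module(f"{package_name}.{module_name}")


# Example: the string stands in for file contents pasted into a cell.
pasted_source = 'def greet():\n    return "hello"\n'
work_dir = tempfile.mkdtemp()  # scratch directory on the kernel's file system
helpers = write_module(work_dir, "my_project_pkg", "helpers", pasted_source)
print(helpers.greet())
```

Like the magic-command approach, this only creates files where the kernel process runs, so it does not by itself make the package available on other nodes.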

amangarg96 avatar Mar 19 '19 03:03 amangarg96