Create a Docker image to run notebooks
Currently we use tensorflow/tensorflow:latest (or tensorflow/tensorflow:2.7.0 and tensorflow/tensorflow:2.3.0) to run our notebooks inside a Kubeflow Pipeline.
However, that image is very large and carries many dependencies that the respective notebooks do not need. Because of all those apt packages, binaries, and Python packages, installing the required notebook dependencies on top of the image frequently fails. At the moment only 4 of our 8 sample notebooks can run.
We need to find a basic Python Docker image and install only the necessary requirements like papermill.
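A minimal sketch of what such an image could look like, assuming python:3.9-slim as the base and assuming that papermill plus ipykernel are enough for notebooks that use the default Python kernel. The base image, the tag notebook-runner:dev, and the BUILD_DIR and DO_BUILD variables are placeholders for illustration, not names from the MLX repos.

```bash
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical defaults, not taken from the MLX repos.
BASE_IMAGE="${BASE_IMAGE:-python:3.9-slim}"
IMAGE_TAG="${IMAGE_TAG:-notebook-runner:dev}"
BUILD_DIR="${BUILD_DIR:-$(mktemp -d)}"

# Write a minimal Dockerfile: a plain Python base plus papermill and a kernel.
write_dockerfile() {
  cat > "${BUILD_DIR}/Dockerfile" <<EOF
FROM ${BASE_IMAGE}
RUN python3 -m pip install --no-cache-dir --upgrade pip \\
 && python3 -m pip install --no-cache-dir papermill ipykernel
WORKDIR /work
EOF
}

write_dockerfile
echo "Dockerfile written to ${BUILD_DIR}/Dockerfile"

# Build only when explicitly requested, so the script is safe to dry-run.
if [[ "${DO_BUILD:-no}" == "yes" ]]; then
  docker build -t "${IMAGE_TAG}" "${BUILD_DIR}"
fi
```

Running it with DO_BUILD=yes would build the image. Without it, the script only writes the Dockerfile so it can be reviewed first.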
To show (some of) the steps required to run a notebook on Kubernetes, take a look at this script from the katalog repo, which runs notebooks outside of a cluster:
https://github.com/machine-learning-exchange/katalog/blob/7fcd5ce/tools/bash/run_notebooks.sh#L58-L65
# TODO: find a smaller Docker image
IMAGE="tensorflow/tensorflow:latest"
docker run -i --rm --entrypoint "" "${IMAGE}" bash -c "
# download the notebook
wget -q -O notebook_in.ipynb '${NOTEBOOK_URL}' 2> /dev/null || curl -s -o notebook_in.ipynb '${NOTEBOOK_URL}'
# update pip
python3 -m pip install pip --upgrade --quiet --progress-bar=ascii
# install Elyra requirements, may not all be required beyond 'papermill'
python3 -m pip install -r https://raw.githubusercontent.com/elyra-ai/elyra/master/etc/generic/requirements-elyra.txt --quiet --progress-bar on
# if the notebook has requirements, install those
[[ -n '${REQUIREMENTS}' ]] && python3 -m pip install ${REQUIREMENTS} --quiet --progress-bar=on
# show the installed packages
python3 -m pip list
# run the notebook with papermill
papermill --log-level CRITICAL --report-mode notebook_in.ipynb notebook_out.ipynb
" >> "${LOG_FILE}" 2>&1 && echo OK || echo FAILED
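For comparison, here is a rough sketch of how the same run might look with an image that already has papermill installed, so that only notebook-specific requirements are installed at run time. The image tag notebook-runner:dev, the placeholder notebook URL, and the DO_RUN switch are assumptions for illustration. The download uses Python's standard library because slim base images may not ship wget or curl.

```bash
#!/usr/bin/env bash
set -euo pipefail

IMAGE="${IMAGE:-notebook-runner:dev}"                                # hypothetical prebuilt image
NOTEBOOK_URL="${NOTEBOOK_URL:-https://example.com/notebook.ipynb}"   # placeholder URL
REQUIREMENTS="${REQUIREMENTS:-}"

# Print the commands that would be executed inside the container.
container_script() {
  echo "set -e"
  # download the notebook with Python's standard library
  echo "python3 -c \"import urllib.request; urllib.request.urlretrieve('${NOTEBOOK_URL}', 'notebook_in.ipynb')\""
  # only notebook-specific requirements remain to be installed at run time
  if [[ -n "${REQUIREMENTS}" ]]; then
    echo "python3 -m pip install ${REQUIREMENTS} --quiet"
  fi
  # run the notebook with papermill, same flags as the original script
  echo "papermill --log-level CRITICAL --report-mode notebook_in.ipynb notebook_out.ipynb"
}

if [[ "${DO_RUN:-no}" == "yes" ]]; then
  docker run -i --rm --entrypoint "" "${IMAGE}" bash -c "$(container_script)"
else
  container_script   # dry run: show what would be executed
fi
```

Without DO_RUN=yes the script only prints the in-container commands, which makes it easy to check the quoting before anything is pulled or executed.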
Some Considerations:
- If we use a generic Docker image like python:3.9, then the pip install steps for the elyra-ai requirements have to be repeated every time a notebook is run.
- If we create a custom notebook image, or maybe several, most of the pip install steps are done at the time the Docker image is built, speeding up actual notebook execution.
  - Although the Docker image will be bigger this way, once it has been pulled onto the cluster, it should get cached.
  - The same is not true for previously downloaded Python packages inside the container running the notebook.
  - And generally the increased time for downloading a bigger Docker image is a fraction of the increased time required to download pip packages and the time pip needs on top of that to resolve potential version conflicts.
- We could use several specialized images for notebooks that have similar dependencies:
  - ART+AIF360
  - CodeNet
  - Quantum/Qiskit
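If we go with several specialized images, the runner needs some way to pick one per notebook. One possible sketch is a simple lookup from a dependency group to an image tag, with a generic fallback. The group labels and all image tags below are hypothetical and only illustrate the idea.

```bash
#!/usr/bin/env bash
set -euo pipefail

# Map a notebook's dependency group to an image tag.
# All tags are hypothetical placeholders, not existing images.
image_for_group() {
  case "$1" in
    art|aif360)      echo "notebook-runner-art-aif360:dev" ;;
    codenet)         echo "notebook-runner-codenet:dev" ;;
    quantum|qiskit)  echo "notebook-runner-quantum:dev" ;;
    *)               echo "notebook-runner:dev" ;;   # generic image with only papermill
  esac
}

# Show which image each group would use.
for group in art codenet qiskit other; do
  echo "${group} -> $(image_for_group "${group}")"
done
```

Notebooks that fit none of the groups fall back to the generic image and install their remaining requirements at run time.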
Additional Information:
Also see this notebook runner component with sample pipeline in KFP: