clipper
Support deploying models with GPU access
For Kubernetes, we can use the experimental GPU support feature: https://kubernetes.io/docs/tasks/manage-gpus/scheduling-gpus/
For Docker, we can use nvidia-docker.
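As a sketch of the Kubernetes side, a pod can request a GPU through the device-plugin resource `nvidia.com/gpu`. The manifest below is written as a Python dict for illustration (it could equally be YAML); the pod name, container name, and image are placeholders, not Clipper's actual images:

```python
# Sketch of a Kubernetes pod manifest requesting one GPU via the
# device-plugin resource "nvidia.com/gpu". All names/images below are
# placeholders for illustration only.
import json

pod_manifest = {
    "apiVersion": "v1",
    "kind": "Pod",
    "metadata": {"name": "clipper-model-gpu"},
    "spec": {
        "containers": [
            {
                "name": "model-container",
                "image": "example/model-container:latest",  # placeholder image
                "resources": {
                    # Request one NVIDIA GPU from the device plugin.
                    "limits": {"nvidia.com/gpu": 1}
                },
            }
        ]
    },
}

print(json.dumps(pod_manifest, indent=2))
```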
@dcrankshaw I have done some work on this... I would like to pick this up if it's fine with you guys.
Sure that would be great. Have you worked with the Kubernetes GPU support? Go ahead and assign the issue to yourself.
I've been paying rather close attention to this issue, so I'm just wondering if there's been any behind-the-scenes movement on it? Feels like a major value add for clipper.
I just implemented this. It still needs a bit of testing, but I should have a PR up by the end of the week.
Hi @dcrankshaw, I have tried using nvidia-docker. I installed the nvidia-docker package on my local machine and started the model-server containers with nvidia-docker so they could access the machine's GPU resources, but the model server doesn't get GPU access.
For the latest nvidia-docker I believe you need to pass in runtime="nvidia" in docker.containers.run
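A minimal sketch of that call through the Docker SDK for Python, assuming nvidia-docker 2 is installed. The image and command are just placeholders to check GPU visibility, and the call is guarded so the sketch degrades gracefully when no Docker daemon or NVIDIA runtime is available:

```python
# Sketch: with nvidia-docker 2, GPU access is enabled by passing
# runtime="nvidia" to containers.run() in the Docker SDK for Python.
# The image and command below are placeholders for a GPU smoke test.
run_kwargs = {
    "image": "nvidia/cuda:9.0-base",  # placeholder base image
    "command": "nvidia-smi",          # quick check that the GPU is visible
    "runtime": "nvidia",              # select the nvidia-docker 2 runtime
}

try:
    import docker

    client = docker.from_env()
    output = client.containers.run(**run_kwargs)
    print(output)
except Exception as exc:  # no Docker SDK/daemon/NVIDIA runtime here
    print("docker run skipped:", exc)
```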
After weeks of research and trial and error, I have finally got Clipper to work with GPU and TensorFlow. I think it is worth sharing my experience with those who are also looking at this issue. I will try my best to make the steps clear and concise, summarized as follows:
1. To enable GPU support for Clipper, you first need to install nvidia-docker. For detailed steps, refer to: https://github.com/NVIDIA/nvidia-docker
2. Build your own nvidia-docker image, which will serve as the base image when you build and deploy your Clipper containers. I referred to the following:
◦ https://gitlab.com/nvidia/cuda/blob/ubuntu16.04/9.0/base/Dockerfile
◦ https://gitlab.com/nvidia/cuda/blob/ubuntu16.04/9.0/runtime/Dockerfile
and constructed my own Dockerfile to build an nvidia-docker image with the CUDA runtime. Do not hesitate to overwrite the default PATH and LD_LIBRARY_PATH; I observed that they were not pointing to the right folders. Instead, use the following values:
ENV PATH /usr/local/cuda:${PATH}
ENV LD_LIBRARY_PATH /usr/local/cuda/lib64
Also, you are expected to install the following packages:
• Python3
• python3-pip
• libzmq5
• redis-server
• libsodium18
• build-essential
and also the python packages:
• cloudpickle
• pyzmq
• prometheus_client
• pyyaml
• jsonschema
• redis
• psutil
• flask
• numpy
3. Next, ensure you have also installed cuDNN (refer to: https://docs.nvidia.com/deeplearning/sdk/cudnn-install/index.html). In my case, since I already had the required files on my host machine, all I needed to do was copy them over to the docker image.
4. Create a top-level directory in your docker image and name it container.
5. Copy the following files from your host to /container in the docker image:
COPY containers/python/__init__.py containers/python/tf_container.py containers/python/container_entry.sh containers/python/rpc.py /container/
COPY monitoring/metrics_config.yaml /container/
If you are unsure where to get those files, here is the link: https://github.com/ucbrise/clipper/tree/develop/containers/python
Next, make a minor revision to rpc.py at line 757, changing:
cmd = ['python', '-m', 'clipper_admin.metrics.server']
to:
cmd = ['python3', '-m', 'clipper_admin.metrics.server']
6. Upgrade pip3 to a newer version:
RUN pip3 install --upgrade pip
7. Install tensorflow-gpu and clipper_admin
8. Set the following:
ENV CLIPPER_MODEL_PATH=/model
CMD ["/container/container_entry.sh", "tensorflow-container", "/container/tf_container.py"]
HEALTHCHECK --interval=3s --timeout=3s --retries=1 CMD test -f /model_is_ready.check || exit 1
Note: the HEALTHCHECK statement is important, as clipper_admin needs this information when starting your model.
9. Modify the /etc/docker/daemon.json by adding the following entry:
"default-runtime": "nvidia",
and then restart the docker service to make the above configuration effective.
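For reference, a complete /etc/docker/daemon.json with the nvidia runtime registered typically looks like the fragment below (this is the standard layout installed by nvidia-docker 2; the path assumes nvidia-container-runtime is on the default PATH):

```json
{
    "default-runtime": "nvidia",
    "runtimes": {
        "nvidia": {
            "path": "nvidia-container-runtime",
            "runtimeArgs": []
        }
    }
}
```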
Now you are ready to kick-start Clipper with GPU support. I hope the steps above are useful to you all. I have also provided a docker template here: https://github.com/cwtan501/nvidia_tf_template
Hi @cwtan501, I tried to run your template on an AWS p2 instance but it failed to build the docker image at:
Step 22/35 : RUN apt-get update && apt-get install -y --no-install-recommends cuda-libraries-$CUDA_PKG_VERSION cuda-cublas-9-0=9.0.176.4-1 libnccl2=$NCCL_VERSION-1+cuda9.0 && apt-mark hold libnccl2 && rm -rf /var/lib/apt/lists/*
 ---> Using cache
 ---> b272447bbe1b
Step 23/35 : RUN mkdir -p /usr/local/cuda/include
 ---> Using cache
 ---> 7ab576eadefb
Step 24/35 : COPY /cuda/include/* /usr/local/cuda/include/
COPY failed: no source files were specified
Now nvidia provides cuda docker images, so we can try:
FROM nvidia/cuda:9.2-cudnn7-runtime
# alias python3 -> python
RUN echo '#!/bin/bash\npython3 "$@"' > /usr/bin/python && \
chmod +x /usr/bin/python
# install binary dependencies first (python3 and pip must exist before the pip install below)
RUN mkdir -p /model \
    && apt-get update -qq \
    && apt-get install -y -qq python3 python3-pip libzmq5 libzmq5-dev redis-server libsodium18 build-essential
# install python dependencies
RUN pip3 install cloudpickle==0.5.* pyzmq==17.0.* requests==2.18.* scikit-learn==0.19.* \
    numpy==1.14.* pyyaml==3.12.* docker==3.1.* kubernetes==5.0.* tensorflow==1.6.* mxnet==1.1.* pyspark==2.3.* \
    xgboost==0.7.*
# make sure you run this inside clipper directory
COPY clipper_admin /clipper_admin/
RUN cd /clipper_admin \
    && pip3 install -q .
WORKDIR /container
COPY containers/python/__init__.py containers/python/rpc.py /container/
COPY monitoring/metrics_config.yaml /container/
ENV CLIPPER_MODEL_PATH=/model
HEALTHCHECK --interval=3s --timeout=3s --retries=1 CMD test -f /model_is_ready.check || exit 1
RUN pip3 install -q tensorflow==1.6.*
COPY containers/python/tf_container.py containers/python/container_entry.sh /container/
CMD ["/container/container_entry.sh", "tensorflow-container", "/container/tf_container.py"]
Make sure you run docker build inside the clipper directory; a fresh git clone should do.
Hi @wcwang07! Clipper is adding native support for PyTorch and TF on CUDA 10. I've made a PR adding support for PyTorch + CUDA 10 on docker, and will be rolling out TF support soon. This can be run on an AWS p2 instance; make sure to choose the Deep Learning AMI (Ubuntu) Version 21.
@simon-mo @RehanSD I was using FROM tensorflow/tensorflow:latest-gpu-py3; this image seems to resolve the issue with finding the GPU:0 device.
@simon-mo I ran this new gpu container with the following stats:
recv: 0.000223 s, parse: 0.000013 s, handle: 0.157390 s
check it out at
docker pull wcwang07/test-gpu-container
clipper_conn.register_application(name="hello-tf", input_type="int", default_output="this is default output", slo_micros=3000000)
https://gist.github.com/wcwang07/aef2d54c134f7c43e726bf9d027770c9
python_deployer.deploy_tensorflow_model(clipper_conn=clipper_conn, name="tf-mobilnet", version=1, input_type="int", func=predict, tf_sess_or_saved_model_path='***', base_image='test-gpu-container', pkgs_to_install=['pillow'])
clipper_conn.link_model_to_app(app_name="hello-tf", model_name="tf-mobilnet")
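Once the model is linked to the app, it can be queried over Clipper's REST interface. A minimal sketch, using a placeholder query address (the real one comes from clipper_conn.get_query_addr()) and guarded so it degrades gracefully without a running cluster:

```python
# Sketch: querying the "hello-tf" application via Clipper's REST endpoint.
# query_addr below is a placeholder; in practice use clipper_conn.get_query_addr().
import json

query_addr = "localhost:1337"  # placeholder address
url = "http://%s/hello-tf/predict" % query_addr
payload = {"input": [1, 2, 3]}  # matches input_type="int" from register_application

try:
    import requests

    r = requests.post(url, headers={"Content-Type": "application/json"},
                      data=json.dumps(payload))
    print(r.json())
except Exception as exc:  # no Clipper cluster reachable in this environment
    print("query skipped:", exc)
```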
Docker support is addressed in PR #669.