amazon-sagemaker-examples
Error hosting endpoint knn-dense-5-1: Failed. Reason: The primary container for production variant AllTraffic did not pass the ping health check. Please check CloudWatch logs for this endpoint..
I built a Docker image for k-NN and pushed it to ECR. When I try to create a SageMaker endpoint from it, endpoint creation fails with: "UnexpectedStatusException: Error hosting endpoint knn-dense-5-1: Failed. Reason: The primary container for production variant AllTraffic did not pass the ping health check. Please check CloudWatch logs for this endpoint..".
When I checked CloudWatch, I saw the following messages:
* Serving Flask app "serve" (lazy loading)
* Environment: production
  WARNING: Do not use the development server in a production environment.
  Use a production WSGI server instead.
* Debug mode: off
* Running on http://0.0.0.0:8080/ (Press CTRL+C to quit)
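The log shows the Flask process starting and listening on port 8080, so the failure is probably not a crash at startup. SageMaker hosting health-checks a container by sending GET /ping on port 8080 and expecting an HTTP 200 response. One way to narrow this down is to run the image locally and send the same request. The following is a minimal sketch that uses only the Python standard library. The helper name `probe_ping` and the default URL are assumptions for illustration and are not part of the original post.

```python
import urllib.error
import urllib.request


def probe_ping(base_url="http://localhost:8080", timeout=2.0):
    """Return the HTTP status of GET <base_url>/ping, or None if unreachable.

    `probe_ping` is a hypothetical helper, not part of any SageMaker SDK.
    """
    try:
        with urllib.request.urlopen(base_url + "/ping", timeout=timeout) as resp:
            return resp.status
    except urllib.error.HTTPError as err:
        # The server answered, but with an error status such as 404 or 500.
        return err.code
    except (urllib.error.URLError, OSError):
        # Nothing is listening on that address, or the request timed out.
        return None
```

Assuming the container was started locally with something like `docker run -p 8080:8080 <image> serve`, calling `probe_ping()` should return 200 for a healthy container. A 404 would suggest the app has no /ping route, a 500 would suggest the route exists but raises an error, and `None` would suggest the port is not reachable.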
The Dockerfile is as follows:
#!/bin/sh
FROM nvidia/cuda:9.0-cudnn7-devel-ubuntu16.04
ARG PYTHON_VERSION=3.6
RUN apt-get update && \
    apt-get -y install --no-install-recommends \
    build-essential \
    cmake \
    vim \
    ca-certificates \
    python3-dev \
    curl \
    wget \
    jq \
    git \
    nginx \
    libjpeg-dev \
    libpng-dev
RUN rm -rf /var/lib/apt/lists/*
#RUN curl -o https://bootstrap.pypa.io/get-pip.py
RUN curl -o ~/miniconda.sh -O https://repo.continuum.io/miniconda/Miniconda3-latest-Linux-x86_64.sh && \
    chmod +x ~/miniconda.sh && \
    ~/miniconda.sh -b -p /opt/conda && \
    rm ~/miniconda.sh && \
    /opt/conda/bin/conda install -y python=$PYTHON_VERSION && \
    /opt/conda/bin/conda clean -ya
ENV PATH /opt/conda/bin:$PATH
RUN pip install --no-cache-dir flask==1.0.2
RUN pip install --no-cache-dir requests
RUN pip install --no-cache-dir numpy==1.14.6
RUN pip install --no-cache-dir Jinja2==2.10
RUN pip install --no-cache-dir Werkzeug
RUN pip install --no-cache-dir boto3
RUN pip install --no-cache-dir scipy==1.1.0
RUN pip install --no-cache-dir git+https://github.com/nmslib/hnswlib#subdirectory=python_bindings
RUN pip install scipy==1.2.1 scikit-learn==0.20.2 pandas==0.24.2 flask gevent gunicorn
RUN pip freeze
RUN python3 -V
COPY knn/serve /usr/local/bin
COPY knn/train /usr/local/bin
EXPOSE 8080
Please help me resolve this error.
hey were you able to solve this?
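One thing worth checking, since the `serve` script itself is not shown: SageMaker hosting expects the container to answer GET /ping with HTTP 200 and to accept POST /invocations, both on port 8080. A server that starts but has no /ping route, or returns an error from it, would fail the health check with exactly this message. The sketch below illustrates that contract. It deliberately uses the standard library `http.server` instead of Flask so that it runs without dependencies. The names `InferenceHandler` and `make_server` and the placeholder response body are assumptions for illustration, not the poster's code.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer


class InferenceHandler(BaseHTTPRequestHandler):
    """Hypothetical handler showing the two routes the hosting contract uses."""

    def do_GET(self):
        # Health check: reply 200 with an empty body on /ping, 404 elsewhere.
        if self.path == "/ping":
            self._reply(200, b"")
        else:
            self._reply(404, b"")

    def do_POST(self):
        if self.path != "/invocations":
            self._reply(404, b"")
            return
        length = int(self.headers.get("Content-Length", 0))
        payload = self.rfile.read(length)
        # Placeholder: a real handler would query the k-NN index here.
        body = json.dumps({"received_bytes": len(payload)}).encode("utf-8")
        self._reply(200, body, "application/json")

    def _reply(self, status, body, content_type="text/plain"):
        self.send_response(status)
        self.send_header("Content-Type", content_type)
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def make_server(host="0.0.0.0", port=8080):
    """Build a server bound to the port SageMaker sends requests to."""
    return HTTPServer((host, port), InferenceHandler)
```

Calling `make_server().serve_forever()` would start the sketch on port 8080. In a Flask app the equivalent would be a route for /ping that returns status 200 and a POST route for /invocations. If the model is loaded inside the ping handler, a slow or failing load could also make the health check fail.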