amazon-sagemaker-examples

Error hosting endpoint knn-dense-5-1: Failed. Reason: The primary container for production variant AllTraffic did not pass the ping health check. Please check CloudWatch logs for this endpoint..

Open · Deepi20 opened this issue 4 years ago • 1 comment

I built a Docker image for knn and pushed it to ECR. When I try to create a SageMaker endpoint from it, endpoint creation fails with the error "UnexpectedStatusException: Error hosting endpoint knn-dense-5-1: Failed. Reason: The primary container for production variant AllTraffic did not pass the ping health check. Please check CloudWatch logs for this endpoint..".

When I checked CloudWatch, I saw the message:
" * Serving Flask app "serve" (lazy loading)
 * Environment: production
   WARNING: Do not use the development server in a production environment.
   Use a production WSGI server instead.
 * Debug mode: off
 * Running on http://0.0.0.0:8080/ (Press CTRL+C to quit)"
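For context, the ping health check is an HTTP GET to /ping on port 8080 that the container has to answer with status 200, and inference requests arrive as POST /invocations. The sketch below is not the code in this image: it is a dependency-free stand-in written with Python's standard library, and the names `InferenceHandler`, `ping_ok`, and `start_server` are made up for illustration. It only shows the shape of the two routes and a small helper for checking /ping.

```python
import threading
import urllib.error
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer


class InferenceHandler(BaseHTTPRequestHandler):
    """Hypothetical stand-in for the container's web app (not the real serve script)."""

    def do_GET(self):
        # Health check: answer 200 on /ping, 404 on any other path.
        status = 200 if self.path == "/ping" else 404
        self.send_response(status)
        self.send_header("Content-Length", "0")
        self.end_headers()

    def do_POST(self):
        # Inference requests arrive on /invocations; this sketch just echoes the body.
        if self.path != "/invocations":
            self.send_response(404)
            self.send_header("Content-Length", "0")
            self.end_headers()
            return
        length = int(self.headers.get("Content-Length", 0))
        body = self.rfile.read(length)
        self.send_response(200)
        self.send_header("Content-Type", "application/octet-stream")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, fmt, *args):
        pass  # silence per-request logging in this sketch


def ping_ok(base_url, timeout=2.0):
    """Return True if GET <base_url>/ping answers with HTTP 200."""
    try:
        with urllib.request.urlopen(base_url + "/ping", timeout=timeout) as resp:
            return resp.status == 200
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, or a non-2xx status all count as a failed ping.
        return False


def start_server(port=8080):
    """Start the stand-in server on a background thread and return it."""
    server = HTTPServer(("0.0.0.0", port), InferenceHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server
```

If the image is run locally with port 8080 published, calling `ping_ok("http://localhost:8080")` should show whether the Flask app actually answers /ping, separately from whether the server process started.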

The Dockerfile is as follows:

#!/bin/sh
FROM nvidia/cuda:9.0-cudnn7-devel-ubuntu16.04

ARG PYTHON_VERSION=3.6
RUN apt-get update && \
    apt-get -y install --no-install-recommends \
    build-essential \
    cmake \
    vim \
    ca-certificates \
    python3-dev \
    curl \
    wget \
    jq \
    git \
    nginx \
    libjpeg-dev \
    libpng-dev
RUN rm -rf /var/lib/apt/lists/*
#RUN curl -o https://bootstrap.pypa.io/get-pip.py

RUN curl -o ~/miniconda.sh -O https://repo.continuum.io/miniconda/Miniconda3-latest-Linux-x86_64.sh && \
    chmod +x ~/miniconda.sh && \
    ~/miniconda.sh -b -p /opt/conda && \
    rm ~/miniconda.sh && \
    /opt/conda/bin/conda install -y python=$PYTHON_VERSION && \
    /opt/conda/bin/conda clean -ya

ENV PATH /opt/conda/bin:$PATH

RUN pip install --no-cache-dir flask==1.0.2
RUN pip install --no-cache-dir requests
RUN pip install --no-cache-dir numpy==1.14.6
RUN pip install --no-cache-dir Jinja2==2.10
RUN pip install --no-cache-dir Werkzeug
RUN pip install --no-cache-dir boto3
RUN pip install --no-cache-dir scipy==1.1.0

RUN pip install --no-cache-dir git+https://github.com/nmslib/hnswlib#subdirectory=python_bindings

RUN pip install scipy==1.2.1 scikit-learn==0.20.2 pandas==0.24.2 flask gevent gunicorn

RUN pip freeze

RUN python3 -V

COPY knn/serve /usr/local/bin
COPY knn/train /usr/local/bin

EXPOSE 8080

Please help me resolve this error.

Deepi20, Mar 25 '20 20:03

hey were you able to solve this?

shrutishrestha, Jul 25 '22 19:07