[pending-change] Triton Inference Server Image Not Shown (sagemaker-tritonserver:21.08-py3)

Open · Csinclair0 opened this issue 3 years ago

Checklist

  • [X] I've prepended issue tag with type of change: [feature]
  • [X] (If applicable) I've documented below the DLC image/dockerfile this relates to
  • [X] (If applicable) I've documented below the test files this relates to

Concise Description:

Provide documentation on the available Triton Inference Server images, as well as how they are built.

DLC image/dockerfile:

sagemaker-tritonserver:21.08-py3

Additional context

I have followed the examples and read the blog post, but there is no clear definition of the Triton server Dockerfiles or how they are built.

https://docs.aws.amazon.com/sagemaker/latest/dg/triton.html

Csinclair0 avatar Nov 30 '21 17:11 Csinclair0

It would be helpful to show how these Triton images are created (the Dockerfiles). Not all ML models can use these images out of the box, so people may need to build their own version of the Triton image. Please consider adding related docs.

n0thing233 avatar Aug 05 '22 23:08 n0thing233

This is still the case. Any reason why this image seems to be the only one not shown? I need it to build my own image and wanted to reference it instead of reading through the BYOC docs and resorting to trial and error 😭 Come on AWS, make my life easier just once 😞

cceyda avatar May 17 '23 04:05 cceyda

The SageMaker Triton container is built using the source code from here - https://github.com/triton-inference-server/server/blob/main/src/sagemaker_server.cc - and the entrypoint script used in the container is in this folder: https://github.com/triton-inference-server/server/tree/main/docker/sagemaker. You can extend these containers https://github.com/aws/deep-learning-containers/blob/master/available_images.md#nvidia-triton-inference-containers-sm-support-only if you want to customize them.
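If extending (rather than rebuilding) is enough, here is a minimal sketch of what that looks like, assuming the us-east-1 repository listed in available_images.md and the 21.08-py3 tag from this issue; "your-extra-package" is a hypothetical placeholder:

```sh
# Hedged sketch: extend the SageMaker Triton DLC instead of rebuilding it.
# Registry/region/tag follow available_images.md for us-east-1; adjust to yours.
cat > Dockerfile <<'EOF'
FROM 785573368785.dkr.ecr.us-east-1.amazonaws.com/sagemaker-tritonserver:21.08-py3
# "your-extra-package" is a placeholder for whatever your models need
RUN pip install --no-cache-dir your-extra-package
EOF
docker build -t my-sagemaker-tritonserver:21.08-py3 .
```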

dhawalkp avatar May 17 '23 05:05 dhawalkp

@nskool FYA

dhawalkp avatar May 17 '23 05:05 dhawalkp

All changes for SageMaker are upstreamed to Triton's GitHub repo - https://github.com/triton-inference-server/server - and so:

The SM-Triton image is essentially the same image as the NGC container, with the following backends enabled and the 'SageMaker' endpoint enabled. The image is pushed to different ECR repos so that SageMaker can easily pull it for use.

```sh
./build.py --enable-logging --enable-stats --enable-tracing \
    --enable-metrics --enable-gpu-metrics --enable-gpu \
    --no-container-interactive \
    --endpoint=http --endpoint=grpc --endpoint=sagemaker \
    --repo-tag=common:$RELEASE_TAG --repo-tag=core:$RELEASE_TAG \
    --repo-tag=backend:$RELEASE_TAG --repo-tag=thirdparty:$RELEASE_TAG \
    --backend=ensemble:$RELEASE_TAG --backend=tensorrt:$RELEASE_TAG \
    --backend=identity:$RELEASE_TAG --backend=repeat:$RELEASE_TAG \
    --backend=square:$RELEASE_TAG --backend=onnxruntime:$RELEASE_TAG \
    --backend=pytorch:$RELEASE_TAG --backend=tensorflow1:$RELEASE_TAG \
    --backend=tensorflow2:$RELEASE_TAG --backend=python:$RELEASE_TAG \
    --backend=dali:$RELEASE_TAG --backend=fil:$RELEASE_TAG \
    --backend=fastertransformer:$RELEASE_TAG \
    --repoagent=checksum:$RELEASE_TAG
```

The build.py script creates the container on-the-fly, and so there isn't a Dockerfile as such.
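As a usage sketch (the rNN.MM release-branch naming is Triton's convention, but the exact branch/tag to use is an assumption, so verify against the repo):

```sh
# Hedged usage sketch: build.py lives in the triton-inference-server/server repo;
# check out the release branch matching the container version you want to build.
git clone -b r21.08 https://github.com/triton-inference-server/server.git
cd server
export RELEASE_TAG=r21.08
./build.py ...  # substitute the full flag list from the command above
```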

Finally, the exact command that starts tritonserver is part of the /usr/bin/serve script here - https://github.com/triton-inference-server/server/blob/c7254d33270c547d54ec92c4b593bb9777da368b/docker/sagemaker/serve#L137
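For reference, a rough reconstruction of what that serve script ends up launching, based only on the flags discussed in this thread (the real script assembles more options from environment variables):

```sh
# Approximate tritonserver invocation assembled by /usr/bin/serve (reconstruction,
# not verbatim). SageMaker traffic hits /invocations and /ping on port 8080.
tritonserver \
  --allow-sagemaker=true \
  --allow-http=false \
  --model-repository=/opt/ml/model
```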

nikhil-sk avatar May 17 '23 09:05 nikhil-sk

Thanks for the quick reply~ I see build.py is what creates the Dockerfile. I had found https://github.com/triton-inference-server/server/blob/c7254d33270c547d54ec92c4b593bb9777da368b/docker/sagemaker/serve#L137 before and realized SageMaker was using --allow-http=false. So now I realize --allow-sagemaker=true was what made it conform to the SageMaker container standard of responding to /invocations and /ping on port 8080. Although in my experience the SageMaker request handling felt like it gave worse performance than HTTP (will need to test it further).

I think I can simply build my image on top of the NGC container, add the pieces from https://github.com/triton-inference-server/server/blob/c7254d33270c547d54ec92c4b593bb9777da368b/build.py#L1049, launch with --allow-http=false --allow-sagemaker=true, and use --model-repository /opt/ml/model.
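Something like this sketch, assuming (per the above) that the NGC image already has the sagemaker endpoint compiled in and that the serve script is copied in from the repo's docker/sagemaker folder:

```sh
# Hedged sketch of the plan above: NGC base image + docker/sagemaker entrypoint.
cat > Dockerfile <<'EOF'
FROM nvcr.io/nvidia/tritonserver:21.08-py3
# "serve" is the script from the server repo's docker/sagemaker folder linked above
COPY serve /usr/bin/serve
RUN chmod +x /usr/bin/serve
ENTRYPOINT ["/usr/bin/serve"]
EOF
docker build -t my-tritonserver-sm:21.08 .
```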

cceyda avatar May 17 '23 11:05 cceyda

Closing this, as the issue has been addressed. For an overview of the Triton image build, please refer to the comment above - https://github.com/aws/deep-learning-containers/issues/1557#issuecomment-1551088683

nikhil-sk avatar Mar 01 '24 22:03 nikhil-sk

@nskool it would still be helpful to release the Dockerfile used to build the SageMaker Triton DLC images. I'm running into a scenario where I need to set up a custom CloudWatch agent on my customized container to pipe Triton's Prometheus metrics into CloudWatch. Because AWS is slower at releasing images than Triton, we are stuck with Python 3.8, since that's what AWS currently supports! Hence the need for customization.
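A sketch of that kind of customization, with the caveats that the agent package URL follows the CloudWatch agent docs for Ubuntu/amd64 (verify for your distro and architecture), the base tag is illustrative, and the agent still needs a config that scrapes Triton's Prometheus endpoint (port 8002 by default):

```sh
# Hedged sketch: layer the CloudWatch agent onto a SageMaker Triton DLC so it
# can forward Triton's Prometheus metrics to CloudWatch.
cat > Dockerfile <<'EOF'
FROM 785573368785.dkr.ecr.us-east-1.amazonaws.com/sagemaker-tritonserver:21.08-py3
RUN curl -fsSL -o /tmp/amazon-cloudwatch-agent.deb \
      https://amazoncloudwatch-agent.s3.amazonaws.com/ubuntu/amd64/latest/amazon-cloudwatch-agent.deb \
 && dpkg -i /tmp/amazon-cloudwatch-agent.deb \
 && rm /tmp/amazon-cloudwatch-agent.deb
# An agent config scraping Triton's :8002 metrics endpoint would be added here.
EOF
docker build -t my-sagemaker-tritonserver-cw:21.08 .
```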

jadhosn avatar Apr 19 '24 03:04 jadhosn