deep-learning-containers
[pending-change] Triton Inference Server Image Not Shown (sagemaker-tritonserver:21.08-py3)
Checklist
- [X] I've prepended issue tag with type of change: [feature]
- [X] (If applicable) I've documented below the DLC image/dockerfile this relates to
- [X] (If applicable) I've documented below the test files this relates to
Concise Description:
Documentation on the available Triton server images, as well as how they are created.
DLC image/dockerfile:
sagemaker-tritonserver:21.08-py3
Additional context
I have followed the examples and read the blog post, but there is no clear definition of the Triton server Dockerfiles or how they are built.
https://docs.aws.amazon.com/sagemaker/latest/dg/triton.html
It would be helpful to show how these Triton images are created (the Dockerfiles). Not all ML models can use these images out of the box, so people might need to build their own version of the Triton image. Please consider adding related docs.
This is still the case. Any reason why this image seems to be the only one whose Dockerfile isn't shown? I need it to build my own image and wanted to reference it instead of reading through the BYOC docs and going by trial and error 😭 Come on AWS, make my life easier just once 😞
The SageMaker Triton container is built using the source code from here - https://github.com/triton-inference-server/server/blob/main/src/sagemaker_server.cc - and the entrypoint script used in the container is in this folder: https://github.com/triton-inference-server/server/tree/main/docker/sagemaker. You can extend these containers (https://github.com/aws/deep-learning-containers/blob/master/available_images.md#nvidia-triton-inference-containers-sm-support-only) if you want to customize them.
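For reference, extending one of the published images usually amounts to a short Dockerfile on top of the DLC tag. A minimal sketch, assuming you look up the actual registry account and region in available_images.md (the `<account>`/`<region>` URI below is a placeholder, and the extra pip install is only a hypothetical customization):

```bash
# Log in to the ECR registry that hosts the SM Triton images
# (substitute the account/region listed in available_images.md).
aws ecr get-login-password --region <region> \
  | docker login --username AWS --password-stdin <account>.dkr.ecr.<region>.amazonaws.com

# A tiny Dockerfile that extends the published image.
cat > Dockerfile.extended <<'EOF'
FROM <account>.dkr.ecr.<region>.amazonaws.com/sagemaker-tritonserver:21.08-py3
# Hypothetical customization: extra dependencies for a python-backend model
RUN pip install --no-cache-dir <your-extra-packages>
EOF

docker build -t my-sagemaker-tritonserver:21.08-py3 -f Dockerfile.extended .
```

The resulting image can then be pushed to your own ECR repo and used as a BYOC image for a SageMaker endpoint.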
@nskool FYA
All changes for SageMaker are upstreamed to Triton's GitHub repo - https://github.com/triton-inference-server/server - and so:
The SM-Triton image is essentially the same image as the NGC container, with the following backends enabled and the 'SageMaker' endpoint enabled. The image is pushed to different ECR repos so that SageMaker can easily pull it for use.
./build.py --enable-logging --enable-stats --enable-tracing --enable-metrics --enable-gpu-metrics --enable-gpu --no-container-interactive --endpoint=http --endpoint=grpc --endpoint=sagemaker --repo-tag=common:$RELEASE_TAG --repo-tag=core:$RELEASE_TAG --repo-tag=backend:$RELEASE_TAG --repo-tag=thirdparty:$RELEASE_TAG --backend=ensemble:$RELEASE_TAG --backend=tensorrt:$RELEASE_TAG --backend=identity:$RELEASE_TAG --backend=repeat:$RELEASE_TAG --backend=square:$RELEASE_TAG --backend=onnxruntime:$RELEASE_TAG --backend=pytorch:$RELEASE_TAG --backend=tensorflow1:$RELEASE_TAG --backend=tensorflow2:$RELEASE_TAG --backend=python:$RELEASE_TAG --backend=dali:$RELEASE_TAG --backend=fil:$RELEASE_TAG --backend=fastertransformer:$RELEASE_TAG --repoagent=checksum:$RELEASE_TAG
The build.py script creates the container on-the-fly, so there isn't a Dockerfile as such.
Finally, the exact command that starts tritonserver is part of the /usr/bin/serve script here - https://github.com/triton-inference-server/server/blob/c7254d33270c547d54ec92c4b593bb9777da368b/docker/sagemaker/serve#L137
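In practice the serve script ends up exec-ing something roughly like the following (a sketch reconstructed from the linked script, not copied verbatim - exact flags and defaults can differ by release):

```bash
# Approximate tritonserver launch from /usr/bin/serve: the SageMaker
# endpoint answers /ping and /invocations on port 8080 (its default),
# while the plain HTTP endpoint is disabled.
tritonserver \
    --allow-sagemaker=true \
    --allow-http=false \
    --model-repository=/opt/ml/model
```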
Thanks for the quick reply~ I see build.py is what creates the image.
I had found https://github.com/triton-inference-server/server/blob/c7254d33270c547d54ec92c4b593bb9777da368b/docker/sagemaker/serve#L137 before and realized SageMaker was using --allow-http=false. So now I realize --allow-sagemaker=true was what made it conform to the SageMaker container standard of responding to /invocations and /ping on port 8080. Although in my experience the SageMaker request handling felt like it gave worse performance than HTTP (will need to test it further).
I think I can simply build my image on top of the NGC container (see the sketch after this list):
+ the stuff from https://github.com/triton-inference-server/server/blob/c7254d33270c547d54ec92c4b593bb9777da368b/build.py#L1049
+ launching with --allow-http=false --allow-sagemaker=true
+ using --model-repository /opt/ml/model
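Rough sketch of that plan for anyone else who lands here (the base tag, the local copy of the serve script, and the test invocation are my assumptions, not an official recipe):

```bash
# "serve" = docker/sagemaker/serve from triton-inference-server/server,
# downloaded into the build context beforehand.
cat > Dockerfile.sm-triton <<'EOF'
FROM nvcr.io/nvidia/tritonserver:21.08-py3
COPY serve /usr/bin/serve
RUN chmod +x /usr/bin/serve && mkdir -p /opt/ml/model
EOF
docker build -t my-sagemaker-tritonserver:21.08-py3 -f Dockerfile.sm-triton .

# Quick local check that the SageMaker endpoint comes up on 8080
# (SageMaker itself starts the container with "serve"); assumes a local
# ./model_repository directory with a Triton model layout.
docker run --rm -p 8080:8080 -v "$PWD/model_repository:/opt/ml/model" \
  my-sagemaker-tritonserver:21.08-py3 \
  tritonserver --allow-sagemaker=true --allow-http=false \
               --model-repository=/opt/ml/model
curl -s localhost:8080/ping
```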
Closing this due to the issue being addressed. For an overview of the Triton image build, please refer to the comment above - https://github.com/aws/deep-learning-containers/issues/1557#issuecomment-1551088683
@nskool it would still be helpful to release the Dockerfile used to build the SageMaker Triton DLC images. I'm running into a scenario where I need to set up a custom CloudWatch agent on my customized container to pipe Triton's Prometheus metrics into CloudWatch - because AWS is slower to release images than Triton, we are stuck with Python 3.8, since that's what AWS currently supports! Hence the need for customization.