Request for non-gpu version docker image (to decrease the image size)
Hi,
I am requesting a Triton build that does not include GPU support. Here are my reasons:
- (Main Reason 1) Docker images with GPU support are very large (10-30x the size of those without it), which increases both hardware and network transfer costs.
- (Main Reason 2) As a SaaS provider, our clients do not always provide servers with GPUs, either because GPUs are expensive or because they do not need latency that low. (And I am pretty sure there are many use cases for a non-GPU version for this reason.)
- TensorFlow Serving has that, but I prefer Triton, so I think it would be a good idea for Triton to have it as well : )
(A TF Serving container without GPU support can be as small as 100MB, which is appealing.)
Duplicate of #3913
cc @jbkyang-nvi @CoderHam
Hi @narolski we are still discussing this internally. Timeline TBD.
@jbkyang-nvi thank you! I've managed to build the CPU-only image with the onnxruntime backend entirely in Docker (without relying on other Dockerfiles to copy data), so if there's any way I can help or contribute, please do let me know! 😃
Hey, can I ask how's the size of your CPU-only image?
@SimZhou the resulting Docker image (Triton binaries + OS dependencies) weighs ~300 MB.
@narolski hi, could you share your docker image or code to build the CPU-only image?
The CPU-only image is badly needed. Has a timeline been decided? @jbkyang-nvi
@lfxx hey, can you elaborate on why you can't use build.py to build your CPU-only image? The instructions are in build.md.
Not sure what @narolski did when he built the container, but I was able to get ~500MB with:
python3 build.py --enable-logging --enable-stats --enable-metrics --endpoint=http --endpoint=grpc --image=base,ubuntu:20.04 --backend=onnxruntime:"main"
Also just to clarify, we're still trying to internally determine prioritization :)
I can build the CPU-only image myself, but I still look forward to an official release, as it would be more stable than one I build on my own.
I built a CPU-only image with the onnx / python / ensemble backends for version r22.06:
docker pull jackiexiao/tritonserver:22.06-onnx-py-cpu
build command:
git checkout r22.06
python3 build.py \
--enable-logging --enable-stats --enable-tracing --enable-metrics \
--endpoint=http --endpoint=grpc \
--backend=ensemble \
--backend=python \
--backend=onnxruntime
docker tag tritonserver:latest tritonserver:22.06-onnx-py-cpu
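For other backend combinations, the flag list in the command above could be assembled with a small helper. This is only a sketch: `cpu_build_flags` is a hypothetical function, not part of Triton, and it assumes that build.py enables GPU support only when `--enable-gpu` is passed, so leaving that flag out keeps the build CPU-only.

```shell
# Hypothetical helper (not part of Triton): print the build.py flags for a
# CPU-only build, one per line. The fixed flags mirror the command above;
# --enable-gpu is never emitted (assumption: that is what keeps GPU off).
cpu_build_flags() {
  printf '%s\n' \
    --enable-logging --enable-stats --enable-tracing --enable-metrics \
    --endpoint=http --endpoint=grpc
  # One --backend=<name> flag per argument passed to the function.
  for backend in "$@"; do
    printf '%s\n' "--backend=${backend}"
  done
}

# Preview the flags for the backends used in this thread.
cpu_build_flags ensemble python onnxruntime

# To run the build from a checkout of the server repo, the output could be
# passed straight to build.py (left commented out here):
# python3 build.py $(cpu_build_flags ensemble python onnxruntime)
```

Printing the flags first makes it easy to check the list before starting a long build.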
@lfxx I've copied the Triton Server executable (along with its dependencies) from the image used to build the Triton Server to the image used as the runtime for Triton Server instance. This provided the additional size reduction.
Are you using multi-stage builds to reduce the size?
I haven't managed to build the image, but thank you for sharing yours...