Request for non-gpu version docker image (to decrease the image size)

Open SimZhou opened this issue 3 years ago • 13 comments

Hi,

I am requesting a Triton version that does not include GPU support. Here are my reasons:

  1. (Main Reason 1) Docker images with GPU support are very large (10-30x the size of images without GPU support), which increases both hardware and network transfer costs.
  2. (Main Reason 2) As a SaaS provider, our clients do not always provide servers with GPUs, either because GPUs are expensive or because they do not need latency that low. (And I am pretty sure there are many use cases for a non-GPU version for this reason.)
  3. TensorFlow Serving has that, but I prefer Triton, so I think it would be a good idea for Triton to have it as well : )

Triton

TF Serving (A TF Serving container without GPU support can be as small as 100 MB, which is appealing.)

SimZhou, Feb 18 '22 08:02

Duplicate of #3913

narolski, Feb 18 '22 10:02

cc @jbkyang-nvi @CoderHam

Tabrizian, Feb 19 '22 00:02

Hi @narolski we are still discussing this internally. Timeline TBD.

jbkyang-nvi, Feb 22 '22 22:02

@jbkyang-nvi thank you! I've managed to build the CPU-only image with onnxruntimebackend entirely in Docker (without relying on other Dockerfiles to copy data), so if there's any way I can help or contribute, please do let me know! 😃

narolski, Feb 25 '22 09:02

Hey, can I ask how large your CPU-only image is?

SimZhou, Feb 26 '22 14:02

@SimZhou the resulting Docker image (Triton binaries + OS dependencies) weighs ~300 MB.

narolski, Mar 04 '22 21:03

@narolski hi, could you share your Docker image or the code to build the CPU-only image?

Jackiexiao, May 20 '22 04:05

The CPU-only image is badly needed. Has a timeline been decided? @jbkyang-nvi

lfxx, Jul 04 '22 02:07

@lfxx hey, can you elaborate on why you can't use build.py to build your CPU-only image? The instructions are in build.md.

Not sure what @narolski did when he built the container, but I was able to get ~500 MB with

python3 build.py --enable-logging --enable-stats --enable-metrics --endpoint=http --endpoint=grpc --image=base,ubuntu:20.04 --backend=onnxruntime:"main"

jbkyang-nvi, Jul 06 '22 23:07

Also just to clarify, we're still trying to internally determine prioritization :)

jbkyang-nvi, Jul 06 '22 23:07

I can build the CPU-only image myself, but I still look forward to an official release, as it would be more stable than building it on my own.

lfxx, Jul 11 '22 08:07

I built a CPU-only image with the onnx / python / ensemble backends for version r22.06:

docker pull jackiexiao/tritonserver:22.06-onnx-py-cpu

Build command:

git checkout r22.06
python3 build.py  \
    --enable-logging --enable-stats --enable-tracing --enable-metrics \
    --endpoint=http --endpoint=grpc \
    --backend=ensemble \
    --backend=python \
    --backend=onnxruntime
docker tag tritonserver:latest tritonserver:22.06-onnx-py-cpu

Jackiexiao, Sep 02 '22 03:09
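
For anyone comparing the sizes quoted in this thread (~300 MB, ~500 MB), here is a minimal shell sketch for checking a locally built image. It assumes the docker CLI is installed and that the image was tagged as in the build command above; the helper names `bytes_to_mib` and `image_size_mib` are made up for illustration.

```shell
#!/bin/sh

# Convert a byte count to whole mebibytes (integer division, rounds down).
# Hypothetical helper, not part of Triton or Docker.
bytes_to_mib() {
    echo $(( $1 / 1024 / 1024 ))
}

# Print the size of a local Docker image in MiB.
# Assumes the docker CLI is available and the image exists locally.
image_size_mib() {
    bytes=$(docker image inspect --format '{{.Size}}' "$1") || return 1
    bytes_to_mib "$bytes"
}

# Example (tag taken from the build command above):
# image_size_mib tritonserver:22.06-onnx-py-cpu
```

Note that `docker image inspect` reports the uncompressed size on disk, which is usually larger than the compressed size shown on a registry page, so numbers from different sources may not be directly comparable.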

@lfxx I copied the Triton Server executable (along with its dependencies) from the image used to build Triton Server into the image used as the runtime for the Triton Server instance. That is what provided the additional size reduction.

narolski, Sep 06 '22 08:09

Are you using multi-stage builds to reduce the size?

SimZhou, Feb 03 '23 06:02
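
No answer to this question is recorded in the thread, but the approach narolski describes (copying the server and its dependencies from a build image into a separate runtime image) maps naturally onto a Docker multi-stage build. The following is only a sketch under stated assumptions, not narolski's actual Dockerfile: the source tag, the `/opt/tritonserver` install prefix, the `ubuntu:20.04` runtime base, and the names `write_slim_dockerfile` and `Dockerfile.slim` are all assumptions for illustration.

```shell
#!/bin/sh

# Write a hypothetical two-stage Dockerfile into the given directory
# (defaults to the current directory).
write_slim_dockerfile() {
    out_dir=${1:-.}
    cat > "$out_dir/Dockerfile.slim" <<'EOF'
# Stage 1: the full image produced by build.py (tag assumed from this thread).
FROM tritonserver:22.06-onnx-py-cpu AS full

# Stage 2: a plain OS image that receives only the server files.
FROM ubuntu:20.04
# /opt/tritonserver is assumed to be the install prefix in the full image.
COPY --from=full /opt/tritonserver /opt/tritonserver
ENV PATH="/opt/tritonserver/bin:${PATH}"
# The shared libraries tritonserver links against must also be installed
# here; the thread does not list them, so that step is left out.
ENTRYPOINT ["tritonserver"]
EOF
}

# Usage (requires docker):
# write_slim_dockerfile . && docker build -f Dockerfile.slim -t tritonserver:slim .
```

As written, the second stage will not start until the missing OS packages are added; running `ldd /opt/tritonserver/bin/tritonserver` inside the full image is one way to find out which ones are needed.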

I haven't managed to build the image, but thank you for sharing yours...

espoirMur, Mar 26 '24 22:03