ndeep27
@singhniraj08 Can you please help with the above query? I want to see if there is a way to convert a TF graph to a CUDA graph for serving optimization.
Thanks @singhniraj08. For (1), when looking at the TensorBoard profile I don't see any operations happening on the CPU; all operations are running on the GPU.
Hi @BowenFu, is there an update on this issue? We are facing a similar issue.
@sourabh-burnwal Even the latest version (24.07) fails without giving any specific error. For instance, below is what I see in the log:

```
gmake[3]: Leaving directory `/tmp/citritonbuild2406/tritonserver/build/_deps/repo-third-party-build/grpc/src/grpc-build'
cd /tmp/citritonbuild2406/tritonserver/build/_deps/repo-third-party-build/grpc/src/grpc-build && /usr/local/bin/cmake...
```
@sourabh-burnwal Can you please help with the above?
@sourabh-burnwal Is there a CPU-only version of Triton with the PyTorch backend released in open source?
@sourabh-burnwal We do configure it via the model config, where we specify CPU, but the issue is that Triton libraries like libtorch_cpu need cudart and other related CUDA libraries, which is leading to...
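For reference, the CPU pinning in our model config looks roughly like the fragment below (a minimal sketch; the model name and backend value are placeholders, not our actual model):

```
# config.pbtxt — minimal sketch; "my_model" is a placeholder name
name: "my_model"
backend: "pytorch"
instance_group [
  {
    # run all instances of this model on CPU only
    count: 1
    kind: KIND_CPU
  }
]
```

Even with `KIND_CPU` set here, the backend shared libraries themselves still try to dynamically link the CUDA runtime, which is the problem we are hitting.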
@sourabh-burnwal Can you send me the exact Docker image you used to run it?
For instance, I downloaded nvcr.io/nvidia/tritonserver:24.07-pyt-python-py3, and when I ssh into the container and run the command below, I see libraries like cudart linked:

```
root@031a384b7f38:/opt/tritonserver# ldd backends/pytorch/libtorch_cpu.so
        linux-vdso.so.1 (0x00007fff1d4cf000)
        libc10.so =>...
```
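A quick way to audit any shared object for CUDA dependencies is a small `ldd` wrapper (a sketch; `cuda_deps` is a helper name I made up, and the path you pass is whatever library you want to inspect):

```shell
# cuda_deps: print only the CUDA-related dynamic dependencies of a shared
# object (cudart, cudnn, nvrtc, etc.); prints nothing if there are none.
cuda_deps() {
  ldd "$1" 2>/dev/null | grep -iE 'cuda|cudnn|nvrtc' || true
}

# Example usage inside the container:
#   cuda_deps /opt/tritonserver/backends/pytorch/libtorch_cpu.so
```

If this prints anything for `libtorch_cpu.so`, the "CPU" library still requires the CUDA runtime to be present at load time, which matches what we are seeing.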
We also cannot use these Docker images directly, since our OS is a variant of RHEL, so we have to build Triton from source for our OS.