Jacky
Hi,

> A quick solution I found was to not use the explicit mode as Triton will then attempt to load all models from the repository.

Does `--model-control-mode=explicit --load-model=*` help?...
Thanks for reporting the issue, I have filed a ticket for us to investigate further.

> In any case I'll be happy to provide the necessary fixes.

Any contribution is...
Hi @mutkach, I took a deeper look into the Python AsyncIO client and it seems we already have decompression built in. When calling the async `infer()`, it will: 1. [read...
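For reference, built-in response decompression in an HTTP client boils down to inspecting the `Content-Encoding` header and decompressing the body accordingly. This is only a minimal standalone sketch of that mechanism, not the actual client code; `decompress_response` is a hypothetical helper:

```python
import gzip
import zlib


def decompress_response(headers: dict, body: bytes) -> bytes:
    """Sketch: decide how to decompress an HTTP response body based on
    the Content-Encoding header, as a client with built-in
    decompression would."""
    encoding = headers.get("content-encoding", "").lower()
    if encoding == "gzip":
        return gzip.decompress(body)
    if encoding == "deflate":
        return zlib.decompress(body)
    # No (or unknown) compression negotiated: return the body as-is.
    return body


# Example: a gzip-compressed payload round-trips transparently.
payload = b'{"outputs": []}'
compressed = gzip.compress(payload)
assert decompress_response({"content-encoding": "gzip"}, compressed) == payload
```

The point is that callers never see the compressed bytes; the decision is made per response from the header, so no extra handling is needed on the user side.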
Hi @lawliet0823, does the segmentation fault happen every time you run the code? Can you use a tool (e.g., gdb) to print the stack trace when the segmentation fault happens?
I wonder if you still get the segmentation fault if you run [ensemble_image_client.cc](https://github.com/triton-inference-server/client/blob/main/src/c%2B%2B/examples/ensemble_image_client.cc#L179-L199) directly via the command line? If not, maybe there is a difference in the inputs passed into...
Hi @Timen, does setting [Model WarmUp](https://github.com/triton-inference-server/server/blob/main/docs/user_guide/model_configuration.md#model-warmup) on all composing models not help? The ensemble model only passes inputs and outputs between composing models and does not do any real...
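For reference, warmup is configured per composing model in its `config.pbtxt` rather than on the ensemble itself. A minimal sketch of the `model_warmup` stanza, assuming a composing model with a single FP32 input (the `INPUT0` name and `[ 16 ]` dims are placeholders for your model's actual input):

```
model_warmup [
  {
    name: "sample_warmup"
    batch_size: 1
    inputs {
      key: "INPUT0"
      value: {
        data_type: TYPE_FP32
        dims: [ 16 ]
        zero_data: true
      }
    }
  }
]
```

With this in each composing model's config, the backends run their warmup passes at load time, so the first real ensemble request does not pay the initialization cost.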
@GuanLuo @nnshah1 do you want to file a ticket for adding warmup config into ensemble model config?
> Do you know whether this is the expected behaviour?

I think cancellation should be propagated to the composing model requests as well. This is expected at this time,...
Hi @blackhu,

> docker run -it --rm --gpus all --net=host nvcr.io/nvidia/tritonserver:23.11-py3-igpu-sdk
> docker run -it --rm --gpus all --net=host nvcr.io/nvidia/tritonserver:23.11-py3-sdk

I am wondering if you meant to download the `nvcr.io/nvidia/tritonserver:23.11-py3-igpu` container...
Thanks for suggesting improvements. I have added a ticket for us to investigate further.