Kris Hung
Hi @yoo-wonjun, regarding > When the problematic request completes, I check nvidia-smi and the memory is 5680 MiB, and when I repeat the request the memory is still 5680 MiB. I was...
@yoo-wonjun Thanks for the explanation. > If it depends on the framework you mentioned, does this mean it may be a problem that occurs when using TensorRT? I mean...
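If it helps to narrow this down, one way to watch GPU memory between requests is to poll `nvidia-smi` directly (a minimal sketch; it assumes a reasonably recent driver that supports these query flags):

```bash
# Report the GPU's used memory (in MiB) once per second while sending
# repeated requests, to see whether usage keeps growing or stays flat.
nvidia-smi --query-gpu=timestamp,memory.used --format=csv -l 1
```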
Closing due to lack of activity. Please re-open if you would like to follow up on this issue.
@tanmayv25 for vis.
Hi @MatthieuToulemont, the Triton TRT-LLM container is a special container that only contains the TRT-LLM backend and the Python backend. If you'd like to use other backends, you could try either...
No, the Python Backend should be the same.
@tricky61 The `nvcr.io/nvidia/tritonserver:24.05-py3` container contains the ONNX Runtime, TensorRT, and PyTorch backends. The `nvcr.io/nvidia/tritonserver:24.05-trtllm-python-py3` container only has the TRT-LLM and Python backends.
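If you want to confirm what a given image ships, you can list its backends directory (a quick check, assuming the stock 24.05 images):

```bash
# Each backend is a directory under /opt/tritonserver/backends.
docker run --rm nvcr.io/nvidia/tritonserver:24.05-py3 \
    ls /opt/tritonserver/backends
docker run --rm nvcr.io/nvidia/tritonserver:24.05-trtllm-python-py3 \
    ls /opt/tritonserver/backends
```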
@tricky61 It shouldn't make any difference. Note that you'd have to `pip install vllm` and make sure `model.py` exists under `/opt/tritonserver/backends/vllm_backend`.
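For reference, a minimal sketch of that setup inside the `24.05-py3` container; the source path used below (`/path/to/vllm_backend/src/model.py`) is a placeholder for wherever you have the vLLM backend's `model.py` checked out:

```bash
# Install vLLM into the container's Python environment.
pip install vllm

# Copy the vLLM backend's model.py to where Triton will look for it.
mkdir -p /opt/tritonserver/backends/vllm_backend
cp /path/to/vllm_backend/src/model.py /opt/tritonserver/backends/vllm_backend/
```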
@kaiyux Could you advise on what the approach for external contributions should be here?
Thanks @kaiyux! I can help with integrating into the internal repo once the changes are finalized. What steps need to be taken to properly credit the contributor?