Ryan McCormick

Showing 160 comments by Ryan McCormick

Hi @Will-Chou-5722, I think your observations look correct. The TensorRT backend is unique in that it uses one thread for multiple model instances on the same GPU, whereas most...
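
For context, a rough sketch (not an official benchmark) of how one might check whether additional TensorRT instances on the same GPU actually add concurrency is to measure throughput at increasing client concurrency with tritonclient. The model name, input name/shape, and server URL below are placeholders for your setup:

```python
import time

import numpy as np
import tritonclient.http as httpclient

URL = "localhost:8000"                             # placeholder server address
MODEL = "my_trt_model"                             # placeholder TensorRT model name
INPUT_NAME, SHAPE = "INPUT0", (1, 3, 224, 224)     # placeholder input name/shape


def throughput(concurrency, n_requests=200):
    # `concurrency` controls how many requests the HTTP client keeps in flight.
    client = httpclient.InferenceServerClient(url=URL, concurrency=concurrency)
    data = np.random.rand(*SHAPE).astype(np.float32)
    inp = httpclient.InferInput(INPUT_NAME, list(SHAPE), "FP32")
    inp.set_data_from_numpy(data)

    start = time.time()
    pending = [client.async_infer(MODEL, inputs=[inp]) for _ in range(n_requests)]
    for req in pending:
        req.get_result()               # block until that request completes
    elapsed = time.time() - start

    client.close()
    return n_requests / elapsed


if __name__ == "__main__":
    for c in (1, 2, 4, 8):
        print(f"concurrency={c}: {throughput(c):.1f} infer/sec")
```

If throughput plateaus almost immediately no matter how many instances are in the instance_group, that would be consistent with the single execution thread behavior described above; perf_analyzer can give you the same picture with less code.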

Hi @AshwinAmbal, can you reproduce this issue with the 24.06 release? I think this change from @oandreeva-nv may help with the issue you're observing: https://github.com/triton-inference-server/server/pull/7325.

Hi @AshwinAmbal, thanks for adding a ticket for tracking! CC @yinggeh @harryskim @statiraju for viz

Hi @SunnyGhj, thanks for filing an issue. Do you have any experiments or data showing this as a bottleneck? And have you tried modifying the code to see if it...

Hi @asamadiya, while we don't have an official example at this time, I see there are some open-source projects that aimed to do this. One such example is here, which...

Hi @JindrichD, thanks for sharing such a detailed issue. Can you try to reproduce this on the 24.07 release? There were recently some changes to how responses are written for...

Hi @mbahri, do you have a minimal model, client, and steps to reproduce that you could share to help expedite debugging? If it is a generic Python backend shm issue, then...
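
To help frame what we're after, a minimal repro for a Python backend shared memory issue usually boils down to a model.py along the lines of the sketch below (input/output names and dtypes are just placeholders), plus the matching config.pbtxt and the client script that triggers the problem:

```python
# model.py -- minimal Python backend model that echoes its input.
import numpy as np
import triton_python_backend_utils as pb_utils


class TritonPythonModel:
    def execute(self, requests):
        responses = []
        for request in requests:
            # Inputs arrive via the shared memory region managed by the backend.
            in0 = pb_utils.get_input_tensor_by_name(request, "INPUT0")
            out0 = pb_utils.Tensor("OUTPUT0", in0.as_numpy().astype(np.float32))
            responses.append(pb_utils.InferenceResponse(output_tensors=[out0]))
        return responses
```

Something that small, together with the exact server flags and request sizes you use, is usually enough for us to try to reproduce on our side.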

Hi @LinGeLin, can you please provide more details on the use case, as well as an example model and client to reproduce the current lack of support and show the bottlenecks?
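
As a rough illustration of what would help (not a prescription), a short tritonclient script like the one below, with your real model config swapped in, is usually enough for us to reproduce and profile; the model name, input/output names, shape, and URL here are placeholders:

```python
import numpy as np
import tritonclient.http as httpclient

URL = "localhost:8000"          # placeholder server address
MODEL = "example_model"         # placeholder model name

client = httpclient.InferenceServerClient(url=URL)

# One request with random data; shape and dtype are assumptions.
data = np.random.rand(1, 16).astype(np.float32)
inp = httpclient.InferInput("INPUT0", list(data.shape), "FP32")
inp.set_data_from_numpy(data)

result = client.infer(MODEL, inputs=[inp])
print(result.as_numpy("OUTPUT0"))
```

Even if the real model can't be shared, a stand-in with the same input/output shapes and the same request pattern usually shows the bottleneck just as well.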