Katherine Yang comments

Results: 100 comments of Katherine Yang

TensorRT model low throughput

Hello @rs-ixz can you share the exact model you're using for TRT and Tensorflow? This is so there's no confusion in reproducing your results. If `CUDA_VISIBLE_DEVICES=8`, that should mean only...

TensorRT model low throughput

> @jbkyang-nvi the models are available in the archive I attached few responses ago (TRT_Slowness.zip) Thanks. Sorry I missed the zip file. How are you converting from tensorflow savedmodel to...

TensorRT model low throughput

Thanks for your quick response! While I'm working on a reproducer, can you try creating the model with `--optShapes` flags to control the range of input shapes including batch...

TensorRT model low throughput

@rs-ixz can you also list the GPUs you are using for measuring perf?

Python Backend: one model instance over multiple GPUs

Hello, what does your model configuration file `config.pbtxt` look like? Also, Triton is up to 24.03 right now. Is there a reason why you are not using the latest version?

Perf Analyzer Error: Cannot send stop request without specifying a request_id

Hello, your image did not upload. Can you specify your model, which version of Triton you are using, etc.? In other words, please fill in the original bug report template here: **Description** A clear and...

Input data/shape validation

> Thanks for looking at this. > > Here is a full example that you can run with `pytest`. For demonstration purpose I use the identity function as model :`torch.nn.Identity()`...

client silent failure - E0422 05:03:24.145960 1 pb_stub.cc:402] An error occurred while trying to load GPU buffers in the Python backend stub: failed to copy data: invalid argument

Hello, while we try to reproduce your issue, can you update your client + server to Triton 24.03? 23.04 is 1 year old and we don't really maintain containers that...

client silent failure - E0422 05:03:24.145960 1 pb_stub.cc:402] An error occurred while trying to load GPU buffers in the Python backend stub: failed to copy data: invalid argument

for `cuda-memory-pool-byte-size` is per GPU. As per the tritonserver cli: > The total byte size that can be allocated as CUDA memory for the GPU device. If GPU support...

Fix inference command sample in README.md

CLA is approved. @jasoncwik can you rebase as well?