Ryan McCormick
Hi @rvroge, thanks for contributing the PR! CC'ing a couple folks who may have some more context @oandreeva-nv @nv-kmcgill53
Hi @chenchunhui97,
> generate onnx for server (with torch version 2.1.2)

If your `bert` model is an ONNX model, then you should be specifying the `onnxruntime` backend in the...
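For reference, a minimal `config.pbtxt` for an ONNX model looks roughly like the sketch below; the tensor names, dtypes, and dims are placeholders and must match what the exported ONNX graph actually declares:

```
name: "bert"
backend: "onnxruntime"
max_batch_size: 8
input [
  {
    name: "input_ids"     # placeholder -- must match the ONNX graph's input name
    data_type: TYPE_INT64
    dims: [ -1 ]
  }
]
output [
  {
    name: "logits"        # placeholder -- must match the ONNX graph's output name
    data_type: TYPE_FP32
    dims: [ -1 ]
  }
]
```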
This seems like a reasonable and relatively simple request to me. @GuanLuo @nnshah1 what do you think?
Hi @chenchunhui97, this example is pretty old, but may work: https://github.com/NVIDIA/DeepLearningExamples/blob/master/PyTorch/LanguageModeling/BERT/triton/README.md
Testing PR: https://github.com/triton-inference-server/server/pull/7756
Hi @Vaishnvi, thanks for sharing such detailed info. Since this is an ONNX model, and the [ORT backend supports full config auto-complete](https://github.com/triton-inference-server/server/blob/main/docs/user_guide/model_configuration.md#auto-generated-model-configuration), can you try to load the model without...
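As a quick sanity check (paths below are placeholders, and exact flag behavior depends on your server version), you can drop the `config.pbtxt` entirely and let the ORT backend derive the inputs/outputs:

```bash
# Minimal layout with no config.pbtxt:
#   models/
#   └── bert/
#       └── 1/
#           └── model.onnx
#
# Recent releases enable auto-complete by default; older releases may need
# --strict-model-config=false to be passed explicitly.
tritonserver --model-repository=/path/to/models --log-verbose=1
```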
Hi @ShuaiShao93, thanks for raising this issue. To clarify, are these errors causing the inference requests to fail? Or is it just logging errors without affecting inference? CC @krishung5 @indrajit96
Thanks for clarifying! I'm going to move this issue to the TRT-LLM Backend repo for further help.
Hi @jasonngap1, I'm transferring this issue to the TRT-LLM backend: https://github.com/triton-inference-server/tensorrtllm_backend for help. CC @schetlur-nv @pcastonguay
Hi @chorus-over-flanger @faileon, thanks for expressing interest in the `/embeddings` route! It's on our radar as another feature to add to the OpenAI frontend support (chat, completions, and models) added...
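For context, the existing routes are plain OpenAI-style HTTP endpoints, and an `/embeddings` route would presumably mirror the standard OpenAI request shape. Purely as an illustrative sketch (the port, model names, and the embeddings endpoint itself are assumptions here, not shipped behavior):

```bash
# Existing OpenAI-compatible route (chat completions); host/port and model
# name are placeholders.
curl http://localhost:9000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{"model": "my-model", "messages": [{"role": "user", "content": "Hello"}]}'

# A future /v1/embeddings route would presumably follow the standard
# OpenAI embeddings request shape:
curl http://localhost:9000/v1/embeddings \
  -H "Content-Type: application/json" \
  -d '{"model": "my-embedding-model", "input": "The quick brown fox"}'
```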