Ryan McCormick
Hi @OvervCW, for simplicity (since we didn't receive a CLA and recently had some major changes around the structure of `main.cc`), I've filed a similar PR here to add this...
My apologies @OvervCW, we do appreciate the contribution nonetheless! For future PRs, please do let us know when a CLA has already been submitted, as it helps the team verify more...
Hi @ClifHouck, thanks for this contribution! While you have figured out a way to have the existing logic propagate the GPU labels to the generic per-model inference metrics, I...
> (1) Clearly MetricModelReporter expected metrics to be decisively enabled or disabled by the time that InferenceServer::Init is called.

I think that's a reasonable thing to expect. `lserver->Init()` initializes most...
Thanks for the submission, @HennerM! We're looking into this PR and the underlying root causes and edge cases throughout the Sequence Batch Scheduler.
CC @GuanLuo @whoisj
Hi @okyspace, we have the restricted endpoint feature for both HTTP and GRPC endpoints: https://github.com/triton-inference-server/server/blob/main/docs/customization_guide/inference_protocols.md#limit-endpoint-access-beta. You should be able to set up key/value pairs to authorize specific routes/features. Does this satisfy...
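For illustration, here is a minimal client-side sketch of how the restricted endpoint feature is typically exercised, assuming the server was started with a restriction along the lines of `--http-restricted-api=model-repository:admin-key=admin-value` (the `admin-key`/`admin-value` pair here is a hypothetical example, not a prescribed name):

```python
# Minimal sketch: calling a restricted HTTP API group with the configured
# key/value pair passed as a request header.
#
# Assumes tritonserver was started with something like:
#   tritonserver --model-repository=/models \
#       --http-restricted-api=model-repository:admin-key=admin-value
#
# The header name/value ("admin-key"/"admin-value") are hypothetical examples.
import tritonclient.http as httpclient

client = httpclient.InferenceServerClient(url="localhost:8000")

# Without the matching header this call should be rejected by the server;
# with it, the restricted model-repository API group is authorized.
index = client.get_model_repository_index(headers={"admin-key": "admin-value"})
print(index)
```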
Hi @okyspace, I don't believe it is currently possible to restrict access to specific models. The workaround would likely be to start a separate `tritonserver` instance for each logical set of models you would...
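As a sketch of that workaround, something like the following launches one server per model group on distinct ports, so access can be segmented per group (the repository paths, group names, and port numbers are hypothetical):

```python
# Minimal sketch of the "separate tritonserver per model group" workaround.
# Each instance serves its own model repository on its own set of ports.
import subprocess

model_groups = {
    # group name -> (model repository path, http port, grpc port, metrics port)
    "team-a": ("/models/team_a", 8000, 8001, 8002),
    "team-b": ("/models/team_b", 9000, 9001, 9002),
}

servers = []
for name, (repo, http_port, grpc_port, metrics_port) in model_groups.items():
    servers.append(subprocess.Popen([
        "tritonserver",
        f"--model-repository={repo}",
        f"--http-port={http_port}",
        f"--grpc-port={grpc_port}",
        f"--metrics-port={metrics_port}",
    ]))

# Block until the servers exit (e.g. on shutdown).
for proc in servers:
    proc.wait()
```

Network-level controls (or the restricted endpoint feature mentioned above) can then be applied per instance rather than per model.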
Looks like this is also a duplicate of https://github.com/triton-inference-server/server/pull/6099
@is did you fill out a CLA as described here: https://github.com/triton-inference-server/server/blob/main/CONTRIBUTING.md#contributor-license-agreement-cla?