Hi @manhtd98, did the client code run successfully? Can you share the entire server output?
We tweaked the code a bit so the above scenario will work. Please refer to the [Python backend documentation](https://github.com/triton-inference-server/python_backend#important-notes) for up-to-date limitations when using the "EXECUTION_ENV_PATH", as we progressively improve...
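For reference, a minimal sketch of how `EXECUTION_ENV_PATH` is typically set in a model's `config.pbtxt`; the tarball name and path here are hypothetical, so adjust them to your own packed environment:

```
# config.pbtxt (sketch): point the Python backend at a packed conda
# environment. "/models/python_model/python-3-10.tar.gz" is an assumed path.
parameters: {
  key: "EXECUTION_ENV_PATH",
  value: {string_value: "/models/python_model/python-3-10.tar.gz"}
}
```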
Hi, the limitation should be Java client specific (unless explicitly mentioned for other clients). For example, this [python grpc client example](https://github.com/triton-inference-server/client/blob/main/src/python/examples/simple_grpc_infer_client.py) sends two synchronous inference requests at [line 126](https://github.com/triton-inference-server/client/blob/main/src/python/examples/simple_grpc_infer_client.py#L126) and [line...
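A minimal sketch of the same pattern, assuming the stock `simple` example model (two INT32 `[1, 16]` inputs) and a server on `localhost:8001`:

```python
import numpy as np
import tritonclient.grpc as grpcclient

# One gRPC client issuing two back-to-back synchronous inferences.
client = grpcclient.InferenceServerClient(url="localhost:8001")

inputs = [
    grpcclient.InferInput("INPUT0", [1, 16], "INT32"),
    grpcclient.InferInput("INPUT1", [1, 16], "INT32"),
]
inputs[0].set_data_from_numpy(np.arange(16, dtype=np.int32).reshape(1, 16))
inputs[1].set_data_from_numpy(np.ones((1, 16), dtype=np.int32))

# The first request blocks until its response arrives...
result1 = client.infer(model_name="simple", inputs=inputs)
# ...and the second reuses the same connection.
result2 = client.infer(model_name="simple", inputs=inputs)
print(result1.as_numpy("OUTPUT0"), result2.as_numpy("OUTPUT0"))
```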
From the log: `failed to load model 'test'`. This could have caused the 'resize' model to be unloaded. Removing the 'test' model from the model repository might solve this issue...
Filed a ticket for us to investigate further.
Please see points 4 and 5 in the documentation regarding relative path limitations: https://github.com/triton-inference-server/python_backend#important-notes If you need improvements to relative path handling, please open a feature request: https://github.com/triton-inference-server/server/issues/new/choose
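As a sketch of what those points imply (the file names are assumptions): a bare relative path is not accepted, while an absolute path or a path anchored at `$$TRITON_MODEL_DIRECTORY` should work:

```
# config.pbtxt (sketch). A bare relative path like this is not supported:
#   value: {string_value: "envs/env.tar.gz"}
# Instead, use an absolute path, or anchor the path at the model directory:
parameters: {
  key: "EXECUTION_ENV_PATH",
  value: {string_value: "$$TRITON_MODEL_DIRECTORY/envs/env.tar.gz"}
}
```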
@tanmayv25 @oandreeva-nv Do you have any insights into loading a YOLOv8 model with the EfficientNMS_TRT plugin in Triton?
Sure, added a ticket for testing the scenario above.
We have run some experiments and found that the asymmetry between streaming and non-streaming depends on whether the model is decoupled or non-decoupled: the cancel event is not triggered...
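For context, whether a model is decoupled is declared in its `config.pbtxt`. A minimal sketch (the field is the documented one; the surrounding model is assumed):

```
# config.pbtxt (sketch): mark the model as decoupled, so it may send zero or
# more responses per request through a response sender instead of exactly one.
model_transaction_policy {
  decoupled: True
}
```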
Hi @NiklasA11, thanks for the suggestion! From a quick read of our current implementation, I think a signature of the model is picked when the model is loaded on Triton, which...