Trawinski, Dariusz

Results: 52 comments by Trawinski, Dariusz

Our direction is to drop the dependency on sudo in the container. Demos should not require it. This should be improved in the new images from the coming releases. The...

That effort is in progress now. A draft PR is at https://github.com/triton-inference-server/openvino_backend/pull/74

Once the PR https://github.com/triton-inference-server/openvino_backend/pull/72 is merged, it will be possible to use models with dynamic shape. Note that with a dynamic shape on the model input, you don't need...
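As a sketch of what dynamic input shapes look like in a Triton model configuration: `-1` marks a dynamic dimension in `config.pbtxt`. The model name, tensor names, and dimensions below are hypothetical placeholders, not taken from the linked PR.

```
# config.pbtxt -- hypothetical model; -1 marks a dynamic dimension
name: "my_dynamic_model"
backend: "openvino"
input [
  {
    name: "input"
    data_type: TYPE_FP32
    dims: [ -1, -1, 3 ]
  }
]
output [
  {
    name: "output"
    data_type: TYPE_FP32
    dims: [ -1 ]
  }
]
```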

@mbahri You could use dynamic batching, but it will not be optimally efficient: it will still use batch padding. You can expect better throughput results by using...
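For reference, enabling dynamic batching in Triton is a short `config.pbtxt` fragment; the batch sizes and queue delay below are illustrative values, not a recommendation:

```
# config.pbtxt fragment -- hypothetical tuning values
max_batch_size: 8
dynamic_batching {
  preferred_batch_size: [ 4, 8 ]
  max_queue_delay_microseconds: 100
}
```

The trade-off mentioned above still applies: requests grouped into a batch may be padded to a common shape, which wastes compute on the padding.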

@siretru I think you shouldn't use normalization. That model expects input data in the range 0-255. Try dropping the division by 255 in preprocessing.
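A minimal sketch of the suggested change, assuming NumPy-based preprocessing (the function name is hypothetical): cast to float for inference but keep the raw 0-255 pixel range instead of scaling to 0-1.

```python
import numpy as np

def preprocess(image_u8: np.ndarray) -> np.ndarray:
    # Cast to float32 for inference, but do NOT divide by 255:
    # the model was trained on raw 0-255 pixel values.
    return image_u8.astype(np.float32)

img = np.array([[0, 128, 255]], dtype=np.uint8)
out = preprocess(img)
print(out.min(), out.max())  # range preserved: 0.0 255.0
```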

The commands look correct. I'm just not sure whether the difference between the model name in the export and in the deployment is accidental. I assume the command to export the model was:...

In addition to https://github.com/triton-inference-server/openvino_backend/pull/101, a couple more changes are needed to compile the backend.

@sriram-dsl Duplicating the models is not the recommended method for scalability and multi-concurrency. The nireq parameter does not enable parallel processing either; it is the size of the...
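For parallel execution, OVMS exposes the OpenVINO plugin configuration in the model config rather than model duplication. A sketch, with a hypothetical model name and path, showing `nireq` (the inference request queue size) alongside `NUM_STREAMS` (the setting that actually controls parallel execution streams):

```
{
  "model_config_list": [
    {
      "config": {
        "name": "my_model",
        "base_path": "/models/my_model",
        "nireq": 8,
        "plugin_config": { "NUM_STREAMS": "4" }
      }
    }
  ]
}
```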

@sriram-dsl You can use perf_analyzer. The KServe API in OVMS is compatible with Triton, so the same benchmarking tool can be used. You can also use this tool: https://github.com/openvinotoolkit/model_server/tree/main/demos/benchmark/python. In Kubernetes...
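A typical perf_analyzer invocation against an OVMS endpoint looks like the following; the model name and address are hypothetical, and the tool ships with the Triton SDK container:

```shell
# Sweep client concurrency from 1 to 8 against a KServe-compatible endpoint
perf_analyzer -m my_model -u localhost:8000 --concurrency-range 1:8 -b 1
```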

@sriram-dsl Did you get an error message with "ENABLE_CPU_RESERVATION": "true"? I suspect that you have an old OVMS version. Can you reproduce it with 2025.1? Without the reservation parameter, each...
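For context, the parameter in question goes into the model's `plugin_config` in the OVMS config file; a minimal sketch with a hypothetical model name and path:

```
{
  "model_config_list": [
    {
      "config": {
        "name": "my_model",
        "base_path": "/models/my_model",
        "plugin_config": { "ENABLE_CPU_RESERVATION": "true" }
      }
    }
  ]
}
```

Older OVMS releases that predate this property would be expected to reject or ignore it, which is why checking against 2025.1 is suggested above.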