Trawinski, Dariusz

Results: 52 comments by Trawinski, Dariusz

Our direction is to drop the dependency on sudo in the container. Demos should not require it. This should be improved in the new images from the coming releases. The...

That effort is in progress now. A draft PR is at https://github.com/triton-inference-server/openvino_backend/pull/74

Once the PR https://github.com/triton-inference-server/openvino_backend/pull/72 is merged, it will be possible to use models with dynamic shape. Note that with a dynamic shape on the model input, you don't need...
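As a sketch of what dynamic input shapes look like in a Triton model configuration: `-1` marks a dynamic dimension in `config.pbtxt`. The model name, tensor names, and dimensions below are hypothetical placeholders, not taken from the linked PR.

```
# config.pbtxt -- hypothetical model; -1 marks a dynamic dimension
name: "my_dynamic_model"
backend: "openvino"
input [
  {
    name: "input"
    data_type: TYPE_FP32
    dims: [ -1, -1, 3 ]
  }
]
output [
  {
    name: "output"
    data_type: TYPE_FP32
    dims: [ -1 ]
  }
]
```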

@mbahri You could use dynamic batching, but it will not be optimally efficient: it will still use batch padding. You can expect better throughput results by using...
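For reference, enabling dynamic batching in Triton is a short `config.pbtxt` fragment; the batch sizes and queue delay below are illustrative values, not a recommendation:

```
# config.pbtxt fragment -- hypothetical tuning values
max_batch_size: 8
dynamic_batching {
  preferred_batch_size: [ 4, 8 ]
  max_queue_delay_microseconds: 100
}
```

The trade-off mentioned above still applies: requests grouped into a batch may be padded to a common shape, which wastes compute on the padding.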

@siretru I think you shouldn't use normalization. That model expects input data in the range 0-255. Try dropping the division by 255 in preprocessing.
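A minimal sketch of the suggested change, assuming NumPy-based preprocessing (the function name is hypothetical): cast to float for inference but keep the raw 0-255 pixel range instead of scaling to 0-1.

```python
import numpy as np

def preprocess(image_u8: np.ndarray) -> np.ndarray:
    # Cast to float32 for inference, but do NOT divide by 255:
    # the model was trained on raw 0-255 pixel values.
    return image_u8.astype(np.float32)

img = np.array([[0, 128, 255]], dtype=np.uint8)
out = preprocess(img)
print(out.min(), out.max())  # range preserved: 0.0 255.0
```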

The commands look correct. I'm just not sure whether the difference between the model name in the export and in the deployment is accidental. I assume the command to export the model was:...

In addition to https://github.com/triton-inference-server/openvino_backend/pull/101, a couple more changes are needed to compile the backend.

@sriram-dsl Duplicating the models is not the recommended method for scalability and multi-concurrency. The nireq parameter does not enable parallel processing either; it is the size of the...
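For parallel execution, OVMS exposes the OpenVINO plugin configuration in the model config rather than model duplication. A sketch, with a hypothetical model name and path, showing `nireq` (the inference request queue size) alongside `NUM_STREAMS` (the setting that actually controls parallel execution streams):

```
{
  "model_config_list": [
    {
      "config": {
        "name": "my_model",
        "base_path": "/models/my_model",
        "nireq": 8,
        "plugin_config": { "NUM_STREAMS": "4" }
      }
    }
  ]
}
```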

@sriram-dsl You can use perf_analyzer. The KServe API in OVMS is compatible with Triton, so the same benchmarking tool can be used. You can also use this tool: https://github.com/openvinotoolkit/model_server/tree/main/demos/benchmark/python. In Kubernetes...
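A typical perf_analyzer invocation against an OVMS endpoint looks like the following; the model name and address are hypothetical, and the tool ships with the Triton SDK container:

```shell
# Sweep client concurrency from 1 to 8 against a KServe-compatible endpoint
perf_analyzer -m my_model -u localhost:8000 --concurrency-range 1:8 -b 1
```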

@sriram-dsl Did you get an error message with "ENABLE_CPU_RESERVATION": "true"? I suspect that you have an old OVMS version. Can you reproduce it with 2025.1? Without the reservation parameter, each...
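For context, the parameter in question goes into the model's `plugin_config` in the OVMS config file; a minimal sketch with a hypothetical model name and path:

```
{
  "model_config_list": [
    {
      "config": {
        "name": "my_model",
        "base_path": "/models/my_model",
        "plugin_config": { "ENABLE_CPU_RESERVATION": "true" }
      }
    }
  ]
}
```

Older OVMS releases that predate this property would be expected to reject or ignore it, which is why checking against 2025.1 is suggested above.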