
Serving the OpenVINO Model In OpenShift

Open ChamanSahil opened this issue 2 years ago • 4 comments

I was going through the CLIP Zero-Shot Image Classification notebook and replicated it in my OpenShift Data Science Hub. As per the notebook instructions, I converted the PyTorch model into the OpenVINO IR format.

But I was unable to find any guide on using the resulting XML and BIN files to serve the model via the Data Science Hub's Model Server. Can anyone please list the steps and guide me on how I can serve the model and get results back?
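For reference, OpenVINO Model Server (which backs the Data Science model serving) expects a versioned model repository rather than bare XML/BIN files. A minimal sketch of that layout; the model name `clip` and the `docker run` invocation are illustrative, and the `touch` lines stand in for copying your real IR files:

```shell
# OVMS expects: <model_root>/<model_name>/<version>/model.{xml,bin}
mkdir -p models/clip/1

# Placeholders for the converted IR files; in practice use e.g.
#   cp your_converted_model.xml models/clip/1/model.xml
touch models/clip/1/model.xml
touch models/clip/1/model.bin

# Serving locally with the OVMS container would then look like:
#   docker run -d -v "$PWD/models:/models" -p 9000:9000 \
#     openvino/model_server:latest \
#     --model_name clip --model_path /models/clip --port 9000
ls -R models
```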

I also checked various other models, and their BIN files are well under 20 MB, whereas the CLIP model's 16-bit variant is 360 MB 😯 and the 8-bit variant is 149 MB. What is the reason for such a large size compared to the others?

Please help me out and get this cleared up ASAP.

ChamanSahil avatar Nov 09 '23 07:11 ChamanSahil

Do you mean something like this?

  • https://docs.openvino.ai/2021.4/ovms_extras_openvino-operator-openshift-readme.html
  • https://developers.redhat.com/learning/learn:openshift-data-science:get-started-intel-openvino/resource/resources:start-your-jupyter-notebook-server-intel-openvino
  • https://developers.redhat.com/learn/openshift-data-science/get-started-intel-openvino
  • https://www.intel.com/content/www/us/en/developer/articles/technical/red-hat-openshift-data-science-with-intel-ai-tools.html

brmarkus avatar Nov 09 '23 09:11 brmarkus

You could write a whole master's thesis about these models and their conversion and quantization. Some operations can be optimized when converting or quantizing, others cannot. But generally speaking, the size reduction from FP32 (32-bit floating point) to FP16 to INT8 to INT4 roughly halves the model size at each step, so the size is dominated by parameter count times bits per weight.
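A back-of-the-envelope check along those lines; the ~150M parameter count for the CLIP ViT-B/16 variant is an assumption, and real files carry some overhead and mixed-precision layers:

```python
def approx_size_mb(n_params: int, bits: int) -> float:
    """Approximate weight-file size: parameters * bits / 8, in decimal MB."""
    return n_params * bits / 8 / 1e6

n_params = 150_000_000  # rough CLIP ViT-B/16 parameter count (assumption)
for bits in (32, 16, 8, 4):
    print(f"{bits:>2}-bit: ~{approx_size_mb(n_params, bits):.0f} MB")
```

At 16 bits this gives ~300 MB and at 8 bits ~150 MB, the same ballpark as the 360 MB and 149 MB files observed above, so the CLIP sizes are expected rather than anomalous.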

brmarkus avatar Nov 09 '23 09:11 brmarkus

[screenshots: Model Server deployment and the resulting warning]

After deploying the model to the Model Server, this is the warning I am seeing. I guess this is why the inference endpoint isn't working. Is that the case? Can anyone help me with this?

Also, can anyone guide me on configuring a valid data connection from the Model Server to an AWS S3 bucket? Currently I am using the following values:

[screenshot: current data connection values]

These are my S3 model files 👇: S3 Bucket
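One common cause of such a warning is the S3 key layout: the model server can only load a model whose keys follow `<model_path>/<version>/model.xml|bin`. A small hedged sketch (the `clip` prefix and key names are illustrative) that checks a key listing, such as one returned by `boto3`'s `list_objects_v2`, against that layout:

```python
import re

# Expected layout under the data-connection model path:
#   <model_path>/<numeric version>/model.xml
#   <model_path>/<numeric version>/model.bin
_KEY_RE = re.compile(r"^(?P<version>\d+)/model\.(xml|bin)$")

def find_servable_versions(keys, model_path):
    """Return versions under model_path that have both model.xml and model.bin."""
    prefix = model_path.rstrip("/") + "/"
    found = {}  # version number -> set of file extensions seen
    for key in keys:
        if not key.startswith(prefix):
            continue
        m = _KEY_RE.match(key[len(prefix):])
        if m:
            ext = key.rsplit(".", 1)[1]
            found.setdefault(int(m.group("version")), set()).add(ext)
    return sorted(v for v, exts in found.items() if exts == {"xml", "bin"})

# Illustrative key listing:
keys = [
    "clip/1/model.xml",
    "clip/1/model.bin",
    "clip/model.xml",  # wrong: missing version directory, will not be loaded
]
print(find_servable_versions(keys, "clip"))  # → [1]
```

If this returns an empty list for your bucket listing, moving the XML/BIN files into a numbered version directory under the model path is the first thing to try.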

ChamanSahil avatar Nov 09 '23 10:11 ChamanSahil

Just saw that @raymondlo84 has posted "OpenVINO and Red Hat OpenShift! This time we showcased Llama2 (INT8 and INT4!!!!!) on GPU+CPU, LCM Stable Diffusion, and OpenVINO notebooks on Red Hat Open Shift :) Thanks everyone making it real." on LinkedIn.

@raymondlo84 maybe you can forward the question?

brmarkus avatar Nov 10 '23 08:11 brmarkus

Closing this, as Ria has followed up.

raymondlo84 avatar Aug 27 '24 19:08 raymondlo84