modelmesh-serving
Support out of distribution detection metrics
If an OOD-enabled model is deployed, ModelMesh metrics should capture the two additional metrics that these models generate as part of the inferencing metrics.
An OOD-enabled model will produce, in a single output tensor:
- Original model inferencing output
- OOD score
We would need an output transformation to separate the original inference output from the OOD score, and logging to record the inputs/outputs and OOD scores (e.g., in OpenShift Logging and/or Prometheus). These are generic functionalities that should be useful beyond OOD.
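To make the separation concrete, here is a minimal sketch assuming the OOD score is packed as the last element of the flattened output tensor; the actual layout depends on how the OOD-enabled model packs its output:

```python
# Hypothetical sketch: split a combined output tensor into the original
# inference output and the OOD score. Assumes the OOD score is appended
# as the last element of the flattened output tensor.
import numpy as np

def split_ood_output(combined: np.ndarray):
    """Return (original_output, ood_score) from a combined output tensor."""
    flat = combined.reshape(-1)
    original_output = flat[:-1]   # original model inference output
    ood_score = float(flat[-1])   # OOD / certainty score
    return original_output, ood_score

# Example: a 10-class softmax output with an OOD score appended
combined = np.append(np.random.dirichlet(np.ones(10)), 0.87)
probs, score = split_ood_output(combined)
```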
Hi @mudhakar, @taneem-ibrahim, just to add more details on OOD (model certainty) enablement and deployment.
- To get a certainty-enabled model, the certainty container takes in the original model and an in-distribution (normal) dataset; the output is a modified model, which is stored at a user-specified location. The modified model is capable of generating the original model's inference output and a certainty score.
- The modified model can be deployed just like a regular model (right diagram). To take advantage of the certainty score, an output transformation can be used to extract the score; the certainty score can then be logged to Prometheus or other relevant services for model monitoring, dashboarding, etc. (a sketch of this extraction-and-logging step follows below).
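A minimal sketch of the extraction-and-logging step, assuming the standard `prometheus_client` library; the metric name, label, and port are illustrative rather than an agreed convention:

```python
# Minimal sketch of emitting the extracted certainty score to Prometheus.
from prometheus_client import Gauge, start_http_server

certainty_gauge = Gauge(
    "model_certainty_score",
    "Certainty (OOD) score of the most recent inference",
    ["model_name"],
)

def record_certainty(model_name: str, score: float) -> None:
    """Expose the latest certainty score for scraping by Prometheus."""
    certainty_gauge.labels(model_name=model_name).set(score)

# Expose metrics on :8080/metrics as a Prometheus scrape target
start_http_server(8080)
record_certainty("my-ood-model", 0.87)
```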
Per a discussion with @njhill and @ckadner, the best path forward is to have an output transformer (similar to the post-processing transformer in KServe) native to ModelMesh, without requiring the KServe controller.
@nirmdesai @mudhakar After further discussion with @njhill and @ckadner, it sounds like our fastest way to get integrated would be to add a custom post-processor as part of OOD for now, until we have kserve-raw or serverless available in ODH.
A proposal for the post-processing transform

Thanks @daw3rd. @njhill, @taneem-ibrahim, @ckadner: The above "KServe Proxy" is the custom post-processor container you proposed last week. Could you please review and confirm this is what you had in mind? cc: @mudhakar
Hi @nirmdesai, is the KServe Proxy (REST server) here replicating functionality similar to this?
@taneem-ibrahim: Just to be precise, we are not going to use the KServe transformer framework (shown in the link you shared) to implement the KServe Proxy. However, the implementation of our KServe Proxy will look similar to a typical pre-/post-processing function like the one shown in that example. The deployment flow will also differ from the link you shared, where the transformer is deployed as part of InferenceService creation. In our case, you would first create an InferenceService as you normally would, and then deploy the proxy container on top of it. You would then use the Proxy APIs for inferencing instead of the InferenceService APIs. cc: @mudhakar, @daw3rd, @spacew
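To illustrate the flow described above, here is a hedged client-side sketch: the InferenceService is created as usual, but requests are sent to the proxy endpoint instead. The proxy URL, model name, and payload layout are hypothetical placeholders in the KServe v2 REST style:

```python
# Hedged sketch of the client-side flow: the InferenceService exists as usual,
# but the client calls the proxy instead. URL and model name are hypothetical.
import requests

PROXY_URL = "http://kserve-proxy.example.svc:8080"   # hypothetical proxy endpoint
MODEL_NAME = "my-ood-model"                          # hypothetical model name

# KServe v2-style REST payload (illustrative shape/datatype)
payload = {
    "inputs": [
        {"name": "input-0", "shape": [1, 4], "datatype": "FP32",
         "data": [5.1, 3.5, 1.4, 0.2]}
    ]
}

# The proxy forwards the request to the InferenceService, strips and logs the
# certainty score, and returns the original inference output.
resp = requests.post(f"{PROXY_URL}/v2/models/{MODEL_NAME}/infer", json=payload)
resp.raise_for_status()
print(resp.json())
```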
Hello @taneem-ibrahim @nirmdesai @mudhakar @spacew @daw3rd cc: @njhill @ckadner
In regard to a proxy service for transforming model output for a certainty-enabled model, below is a diagram demonstrating the interaction with a modelmesh proxy server deployed on OpenShift in the same cluster where RHODS is hosted. Note that in this deployment we also deploy a Prometheus service for logging the model-certainty metrics generated by the modelmesh proxy service over time. Both are packaged via a Helm install; however, if a Prometheus instance already exists, it can be removed.
Please share feedback or comments on the deployment and sequence steps, as well as the endpoint for reaching the modelmesh proxy.
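For illustration only, here is a rough sketch of what the request handling inside such a modelmesh proxy could look like: forward the request to the ModelMesh REST endpoint, strip the certainty score from the output, record it for Prometheus, and return the original prediction. The endpoint paths, in-cluster URL, and output layout are assumptions, not the actual implementation:

```python
# Rough, hypothetical sketch of the proxy's request handling.
from flask import Flask, request, jsonify
from prometheus_client import Gauge, generate_latest
import requests

app = Flask(__name__)
MODELMESH_URL = "http://modelmesh-serving:8008"  # hypothetical in-cluster endpoint
certainty_gauge = Gauge("model_certainty_score", "Latest certainty score", ["model"])

@app.route("/v2/models/<model>/infer", methods=["POST"])
def infer(model):
    # Forward the request unchanged to the ModelMesh REST endpoint
    upstream = requests.post(
        f"{MODELMESH_URL}/v2/models/{model}/infer", json=request.get_json()
    )
    body = upstream.json()
    # Assume the certainty score is the last value of the first output tensor
    data = body["outputs"][0]["data"]
    certainty_gauge.labels(model=model).set(float(data[-1]))
    body["outputs"][0]["data"] = data[:-1]  # return only the original output
    return jsonify(body), upstream.status_code

@app.route("/metrics")
def metrics():
    # Expose recorded certainty scores for a Prometheus scrape
    return generate_latest(), 200, {"Content-Type": "text/plain; version=0.0.4"}

if __name__ == "__main__":
    app.run(host="0.0.0.0", port=8080)
```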
Copying discussions I've had on Slack:
I think TrustyAI can provide a lot of the capabilities that the modelmesh-proxy is aiming to provide, with the advantage of not needing to add another component into the mix.
TrustyAI within ODH/RHODS is a service that intercepts ModelMesh input and output payloads and then sends metrics computed on that input/output data (e.g., fairness metrics) to Prometheus. If we defined a metric that simply grabbed the certainty scores from the model output payload and emitted them to Prometheus as a metric, it'd be a really simple way of doing what you're trying to do.
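As a concept-only sketch (not the TrustyAI API, and not the PoC mentioned below): given intercepted output payloads, a metric of this kind could summarize the certainty scores to be emitted to Prometheus, analogous to how fairness metrics are computed on intercepted input/output data. The payload layout and threshold are assumed:

```python
# Concept-only sketch of a certainty metric over intercepted output payloads.
from statistics import mean

def certainty_summary(output_payloads, threshold=0.5):
    """Return mean certainty and the fraction of requests below a threshold."""
    # Assumes the certainty score is the last value of the first output tensor
    scores = [p["outputs"][0]["data"][-1] for p in output_payloads]
    return {
        "mean_certainty": mean(scores),
        "fraction_uncertain": sum(s < threshold for s in scores) / len(scores),
    }
```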
As a PoC, I've done exactly that and got an OOD model deployed in modelmesh and sending the OOD metrics to Prometheus within OpenDataHub: