ml-commons

[Enhancement] Generated remote model interfaces should include detailed model outputs

Open ohltyler opened this issue 1 year ago • 2 comments

The currently generated model output interfaces include only boilerplate. The useful model outputs live under dataAsMap, but that field is always defined as a generic object with no sub-properties.

"inference_results": {
    "type": "array",
    "items": {
        "type": "object",
        "properties": {
            "output": {
                "type": "array",
                "items": {
                    "properties": {
                        "name": {
                            "type": "string",
                            "description": "This is a test description field"
                        },
                        "dataAsMap": {
                            "type": "object",
                            "description": "This is a test description field"
                        }
                    }
                },
                "description": "This is a test description field"
            },
            "status_code": {
                "type": "integer",
                "description": "This is a test description field"
            }
        }
    },
    "description": "This is a test description field"
}

The dataAsMap field should be further defined to include the actual model outputs.
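To make the gap concrete, here is a minimal sketch that walks the schema fragment above and reports every object-typed node that declares no sub-properties. The helper name `opaque_objects` and the path notation are illustrative assumptions, not part of ml-commons; the schema is the one from this issue with the description fields dropped.

```python
# Schema fragment copied from this issue (descriptions omitted for brevity).
OUTPUT_SCHEMA = {
    "type": "array",
    "items": {
        "type": "object",
        "properties": {
            "output": {
                "type": "array",
                "items": {
                    "properties": {
                        "name": {"type": "string"},
                        "dataAsMap": {"type": "object"},
                    }
                },
            },
            "status_code": {"type": "integer"},
        },
    },
}


def opaque_objects(schema, path="inference_results"):
    """Return paths of object-typed nodes that declare no sub-properties.

    Hypothetical helper for illustration only; it handles just the
    'properties' and 'items' keywords used in the fragment above.
    """
    found = []
    props = schema.get("properties")
    if schema.get("type") == "object" and not props:
        found.append(path)
    # Descend into named properties, then into array items.
    for name, sub in (props or {}).items():
        found.extend(opaque_objects(sub, f"{path}.{name}"))
    if isinstance(schema.get("items"), dict):
        found.extend(opaque_objects(schema["items"], f"{path}[]"))
    return found


print(opaque_objects(OUTPUT_SCHEMA))
```

For the fragment above, the only node reported is dataAsMap, which is exactly the field a client would need described in order to validate or render the model's real output.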

ohltyler avatar Sep 13 '24 17:09 ohltyler

@ohltyler Can you check https://github.com/opensearch-project/ml-commons/pull/2689/files to see if you are satisfied with the current granularity of our output interface?

b4sjoo avatar Sep 16 '24 18:09 b4sjoo

Got it, thanks for the reference! I've tested with some other blueprints, and it seems to work as expected. I think my confusion was about how far a connector can deviate from the blueprint before the detailed interfaces stop being generated. The documentation states:

The predefined model interface is generated based on the connector blueprint and the model's metadata, so you must strictly follow the blueprint when creating the connector in order to avoid errors.

For example, I had some blueprints that had a supported model defined in-line but not included as one of the parameters, and some models that weren't part of the list I'm seeing defined here.

From that code, it looks like a model interface will be generated automatically if and only if:

  1. There is a service_name and model defined under parameters
  2. The values of service_name / model are included in the switch statement
  3. All other parts of the blueprint (the protocol/credentials/actions/request body etc.) can be anything
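If that reading is right, the conditions above amount to a small predicate over the connector's parameters. The sketch below is only an illustration of that reading: the `SUPPORTED` table is a hypothetical stand-in for the switch statement, the service and model names in it are placeholders rather than real ml-commons values, and `gets_predefined_interface` is not an ml-commons function.

```python
# Hypothetical stand-in for the switch statement discussed above. The real
# service and model names live in ml-commons and are not reproduced here.
SUPPORTED = {
    "example_service": {"example-model-a", "example-model-b"},
}


def gets_predefined_interface(connector: dict) -> bool:
    """Return True if the connector would qualify under the conditions above."""
    params = connector.get("parameters") or {}
    service = params.get("service_name")
    model = params.get("model")
    # Condition 1: both values must be present under parameters.
    if service is None or model is None:
        return False
    # Condition 2: the pair must be one of the known combinations.
    # Condition 3: nothing else in the connector is inspected.
    return model in SUPPORTED.get(service, set())


# A custom request body does not matter as long as the parameters match.
custom_body = {
    "parameters": {"service_name": "example_service", "model": "example-model-a"},
    "actions": [{"request_body": "{ \"anything\": \"goes\" }"}],
}

# A model referenced only in-line (for example in the URL) does not qualify.
inline_model = {
    "parameters": {"service_name": "example_service"},
    "actions": [{"url": "https://example.invalid/models/example-model-a"}],
}

print(gets_predefined_interface(custom_body))
print(gets_predefined_interface(inline_model))
```

Under these assumptions, the first connector qualifies despite its custom request body, while the second does not because the model appears only in-line and not under parameters.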

If so, can we update the documentation to spell out which parts are flexible and which are restricted? It would be helpful to know, for example, how users can keep custom request bodies while still getting automatically generated model interfaces.

ohltyler avatar Sep 16 '24 18:09 ohltyler