ml-commons
[Meta] Add model Interface
Problem statement
Currently, when users want to call a model through the ml-commons prediction API, we assume they already know the model's input and output format. A user who is not familiar with the model has to look up the model's source outside of OpenSearch, for example Hugging Face, AWS docs, or OpenAI.com. If a user sends input in the wrong format, they get an IllegalArgumentException.
To make it easier for users to learn the ml-commons model interface (the input/output parameter list and types), we propose adding a model interface that users can refer to when figuring out what input they should provide and what output they should expect. This mapping will also benefit validation checks in the tools, processors, and agents that use machine learning models for predictions.
The Flow Framework team is working on a drag-and-drop UX. They can use the model interface to check whether connected components match. For example, if component A flows into component B, then A's output should match B's input.
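The component-matching check could look roughly like the Python sketch below. The function name `find_mismatches`, the sample schemas, and the rule that every property in B's input must appear in A's output with the same `type` are all assumptions for illustration; this is not the Flow Framework's actual logic.

```python
def find_mismatches(a_output: dict, b_input: dict) -> list:
    """Return reasons why A's output schema cannot feed B's input schema."""
    produced = a_output.get("properties", {})
    problems = []
    for name, spec in b_input.get("properties", {}).items():
        if name not in produced:
            problems.append(f"missing field: {name}")
        elif produced[name].get("type") != spec.get("type"):
            problems.append(
                f"type mismatch for {name}: "
                f"{produced[name].get('type')} != {spec.get('type')}"
            )
    return problems


# Hypothetical schemas for three components in a flow.
component_a_output = {"properties": {"text": {"type": "string"}}}
component_b_input = {"properties": {"text": {"type": "string"}}}
component_c_input = {"properties": {"text": {"type": "array"}}}

print(find_mismatches(component_a_output, component_b_input))
print(find_mismatches(component_a_output, component_c_input))
```

An empty list means A can flow into B under this simplified rule; a real check would probably also need to compare nested `properties` and `items`.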
Scope:
- Developers can define, update, and get a model's input and output mappings.
- Input and output mappings are optional model configs, similar to OpenSearch document mappings. They are not required for a model, but they help standardize the prediction input and output format.
- Users can set up permissions on who can update a model's input/output mappings.
- Users can set input/output mappings in a connector as default mappings; this is optional.
- Users can override the mappings in the model config during model registration calls.
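As a rough illustration of the last two points, the sketch below resolves the effective interface from a connector default and a model-level override. The function name `resolve_interface` and the choice to override `input` and `output` independently (rather than replacing the whole interface) are assumptions; the proposal does not specify the merge granularity.

```python
from typing import Optional


def resolve_interface(connector_default: Optional[dict],
                      model_override: Optional[dict]) -> dict:
    """Pick the effective interface: model-level values win over connector defaults."""
    effective = dict(connector_default or {})
    # Assumption: "input" and "output" are overridden independently.
    effective.update(model_override or {})
    return effective


# Hypothetical connector default and model-level override.
connector_default = {
    "input": {"properties": {"parameters": {"type": "object"}}},
    "output": {"properties": {"inference_results": {"type": "array"}}},
}
model_override = {
    "input": {
        "properties": {
            "parameters": {"properties": {"messages": {"type": "string"}}}
        }
    },
}

effective = resolve_interface(connector_default, model_override)
print(effective["input"] == model_override["input"])
print(effective["output"] == connector_default["output"])
```

Both mappings are optional, so the function returns an empty interface when neither is set.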
Out of Scope:
- This feature focuses on defining the input and output schema. It does not handle format transformation (for example, converting a string into a list of strings); that will be handled separately from this mappings feature.
- The mapping feature supports remote models but does not support local models. The local model input dataset will be refactored in the near future.
Can you add an example to this? A concrete input and output of a model.
Remote model (OpenAI):
"interface": {
  "input": {
    "properties": {
      "parameters": {
        "properties": {
          "messages": {
            "type": "string",
            "description": "This is a test description field"
          }
        }
      }
    }
  },
  "output": {
    "properties": {
      "inference_results": {
        "type": "array",
        "items": {
          "type": "object",
          "properties": {
            "output": {
              "type": "array",
              "items": {
                "properties": {
                  "name": {
                    "type": "string",
                    "description": "This is a test description field"
                  },
                  "dataAsMap": {
                    "type": "object",
                    "description": "This is a test description field"
                  }
                }
              },
              "description": "This is a test description field"
            },
            "status_code": {
              "type": "integer",
              "description": "This is a test description field"
            }
          }
        },
        "description": "This is a test description field"
      }
    }
  }
}
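To show how such an interface could catch a badly formatted request before it reaches the model, here is a minimal Python sketch that checks a request body against the `input` part of the remote model interface above. The `validate` helper and the sample payloads are hypothetical; it only checks `type`, `properties`, and `items`, treats every field as optional, and is a simplified stand-in for real schema validation rather than the ml-commons implementation.

```python
# Map schema type names to Python checks. bool is excluded from the numeric
# types because isinstance(True, int) is True in Python.
TYPE_CHECKS = {
    "string": lambda v: isinstance(v, str),
    "integer": lambda v: isinstance(v, int) and not isinstance(v, bool),
    "number": lambda v: isinstance(v, (int, float)) and not isinstance(v, bool),
    "boolean": lambda v: isinstance(v, bool),
    "array": lambda v: isinstance(v, list),
    "object": lambda v: isinstance(v, dict),
}


def validate(value, schema: dict, path: str = "$") -> list:
    """Collect type errors for `value` against a simplified schema."""
    expected = schema.get("type")
    if expected in TYPE_CHECKS and not TYPE_CHECKS[expected](value):
        return [f"{path}: expected {expected}, got {type(value).__name__}"]
    errors = []
    if isinstance(value, dict):
        for name, sub in schema.get("properties", {}).items():
            if name in value:  # fields are treated as optional in this sketch
                errors += validate(value[name], sub, f"{path}.{name}")
    if isinstance(value, list) and "items" in schema:
        for i, item in enumerate(value):
            errors += validate(item, schema["items"], f"{path}[{i}]")
    return errors


# The "input" part of the remote model interface.
input_schema = {
    "properties": {
        "parameters": {
            "properties": {
                "messages": {
                    "type": "string",
                    "description": "This is a test description field",
                }
            }
        }
    }
}

good = {"parameters": {"messages": "hello"}}   # hypothetical request body
bad = {"parameters": {"messages": ["hello"]}}  # list where a string is expected

print(validate(good, input_schema))
print(validate(bad, input_schema))
```

The second call reports the path of the offending field, which is more actionable for a user than a generic IllegalArgumentException.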
Local model (text embedding):
"interface": {
  "input": {
    "properties": {
      "text_docs": {
        "type": "array"
      },
      "return_number": {
        "type": "boolean"
      },
      "target_response": {
        "type": "array"
      }
    }
  },
  "output": {
    "properties": {
      "inference_results": {
        "type": "array",
        "items": {
          "type": "object",
          "properties": {
            "output": {
              "type": "array",
              "items": {
                "properties": {
                  "name": {
                    "type": "string",
                    "description": "This is a test description field"
                  },
                  "data": {
                    "type": "array",
                    "items": {
                      "type": "number",
                      "description": "This is a test description field"
                    },
                    "description": "This is a test description field"
                  }
                }
              },
              "description": "This is a test description field"
            }
          }
        },
        "description": "This is a test description field"
      }
    }
  }
}