
[RFC][FEATURE] Model alias


Introduction:

Right now our model versions are identified by random IDs, and most of our ML-related operations require the user to refer to the model ID directly. These IDs are hard to remember and use because of their abstractness, which can lead to confusion and errors when users want to refer to a specific model or compare different models. To address this, we propose to design and implement a model alias feature for ml-commons. This feature will allow users to assign a custom alias to each model they create or use, and then use this alias as a reference instead of the model ID. For example, a user can name their model "text_embedding" or "sentiment_analysis" and then use these aliases in their queries or commands.

Benefits of this feature:

  • It will improve the user experience and satisfaction by making it easier and more intuitive to work with machine learning models.
  • It will reduce the risk of errors and mistakes by avoiding confusion between different models or using the wrong model ID.
  • It will enable more efficient and effective communication and collaboration among the users who share or exchange models.

Where the model ID is used today:

  1. The get/delete/deploy/undeploy/predict model APIs currently require a model ID.
  2. In the Neural Search plugin, in order to ingest vectorized documents, you need to create a Neural Search pipeline. A pipeline consists of a series of processors that manipulate documents during ingestion, allowing the documents to be vectorized. The following example API operation creates a Neural Search pipeline:
PUT _ingest/pipeline/nlp-pipeline
{
  "description": "An example neural search pipeline",
  "processors" : [
    {
      "text_embedding": {
        "model_id": "bxoDJ7IHGM14UqatWc_2j",
        "field_map": {
           "passage_text": "passage_embedding"
        }
      }
    }
  ]
}

A neural search query also needs the model ID, for example:

            "query": {
              "neural": {
                "passage_vector": {
                  "query_text": "Hello world",
                  "model_id": "xzy76xswsd",
                  "k": 100
                }
              }
            },

Pain point

Suppose the user has a new model version and wants to use it to replace the old one. They need to update the pipeline by changing the model_id, and they also need to change the model ID in every neural search query.

Another example: a user builds an application with an OpenAI remote model and later prefers to move to another model, such as a Claude model. Today they have to change the model ID in their application code and redeploy, which creates a window of service unavailability. With a model alias, the user can simply move the alias from one model to the other, and all requests will be routed to the new model, with no redeployment and no service-unavailability window.
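To make the benefit concrete, here is a minimal sketch assuming an alias could be supplied anywhere a model_id is accepted today; the alias name "text_embedding_prod" and this exact syntax are illustrative placeholders, not a finalized API:

GET my-index/_search
{
  "query": {
    "neural": {
      "passage_vector": {
        "query_text": "Hello world",
        "model_id": "text_embedding_prod",
        "k": 100
      }
    }
  }
}

When a new model version is registered, re-pointing the alias to the new model would be a single metadata operation, and the pipeline and the query above would keep working unchanged.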

Solution HLD:

Option 1: The model alias is globally unique

Plan:

  1. Model alias is an optional field when the user registers a model (null is a legal value); see the register sketch after this list.
  2. Users can assign, modify, or remove the model alias at any time through the model update API.
  3. Every time a user creates or modifies a model alias, we initiate a global uniqueness check on the alias the user provided. If it does not meet the global uniqueness requirement, an exception is thrown.
  4. Users can re-link a model alias to a different model through the model update API.
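A minimal sketch of steps 1 and 4, assuming the alias is passed in a field named model_alias on the existing register API and re-linked through the model update API; the field name and the update syntax are assumptions for illustration only:

POST /_plugins/_ml/models/_register
{
  "name": "huggingface/sentence-transformers/all-MiniLM-L6-v2",
  "version": "1.0.1",
  "model_format": "TORCH_SCRIPT",
  "model_alias": "text_embedding"
}

# Later, re-link the alias to a newly registered model version (hypothetical)
PUT /_plugins/_ml/models/<new_model_id>
{
  "model_alias": "text_embedding"
}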

pros:

  1. This solution does not require many changes in the backend.
  2. This solution requires fewer checks.

cons:

  1. Users generally want to create intuitive and meaningful names, which are very likely to be similar and can easily collide. This points to user frustration as the number of aliases grows.
  2. The user may need to keep updating the model alias every time they update their model.

Solution LLD:

Plan:

  1. This solution mainly involves three methods: create-alias, update-alias, and delete-alias (a request sketch follows this list).
  2. Global uniqueness check: we check against both model IDs and model aliases.
  3. Create-alias: the API endpoint uses the model ID to locate the model. We first initiate a global uniqueness check; if the alias does not meet the global uniqueness requirement, an exception is thrown. If the check passes, the alias is created and a success message is returned.
  4. Update-alias: the API endpoint uses the model ID to locate the model. We first check whether the model group already owns the alias; if yes, we re-link the alias, otherwise we throw an exception.
  5. Delete-alias: we check whether the user can access the model; if yes, the corresponding alias is removed and released for public reuse.
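A hedged sketch of how these three methods might be exposed as REST endpoints; the paths and body fields below are placeholders for discussion, not a committed API:

# Create an alias for a model (fails if the alias collides with another alias or a model ID)
POST /_plugins/_ml/models/<model_id>/_create_alias
{
  "alias": "text_embedding"
}

# Re-link an existing alias in the model group to this model
POST /_plugins/_ml/models/<model_id>/_update_alias
{
  "alias": "text_embedding"
}

# Remove the alias from the model and release the name
DELETE /_plugins/_ml/models/<model_id>/_alias/text_embedding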

Security

  1. DDoS
  2. Abuse of the update-alias method (re-linking an alias without communicating with collaborators)

Backward Compatibility

  • This solution does not involve any breaking changes, so backward compatibility won’t be a problem here.

Testability

  • The logic is pretty straightforward, so testing won’t be a problem here. If we want, we can ask the neural search team to test the feature before release to gather some metrics and feedback.
