[Client] Description not populating
Describe the bug
When a registered model is created and description is given, it is not set.
mr.register_model(
name="test-model",
uri="s3://fake-bucket/fake-model",
version="0.0.3",
model_format_version="v1",
model_format_name="onnx",
description="A fake model",
metadata={
"accuracy": 0.95,
},
)
It returns:
RegisteredModel(name='test-model', id='1', description=None, external_id=None, create_time_since_epoch='1752003127919', last_update_time_since_epoch='1752003127919', custom_properties=None, owner='sglinton', state=<RegisteredModelState.LIVE: 'LIVE'>)
and when indexed directly at:
curl http://localhost:8000/api/model_registry/v1alpha3/model_artifact?name=test-model&parentResourceId=3
it returned:
{
"artifactType": "model-artifact",
"createTimeSinceEpoch": "1752003335640",
"customProperties": {},
"id": "2",
"lastUpdateTimeSinceEpoch": "1752003335640",
"modelFormatName": "onnx",
"modelFormatVersion": "v1",
"name": "test-model",
"state": "UNKNOWN",
"uri": "s3://fake-bucket/fake-model"
}
Expected behavior Description included in registered model python resource or API response
Additional context Latest version of MR at the time or writing
If I recall correctly it was requested to be affecting the ModelVersion.
A possible improvement here could be that if the RegisteredModel does not exists the description of the RegisteredModel creation take the same value.
On correcting this bug, we need to ensure when a new ModelVersion is created, if the RegisteredModel already exists, we do not override the value.
I recall possibly @dbasunag had the same question a while back, and this was clarified in the docs somewhere maybe?
Wdyt?
@tarilabs I raised similar question here: https://github.com/kubeflow/model-registry/issues/1027 and the discussion was captured in comments here https://github.com/kubeflow/model-registry/pull/1029#issuecomment-2835727254
thanks @dbasunag similar but not quite since there the PR would have added 2 more types of Description; I would avoid adding more description parameters, and potentially I would consider:
if the RegisteredModel does not exists the description of the RegisteredModel creation take the same value
wdyt?
@tarilabs I personally would prefer it stays aligned with model version description only. Otherwise, from automation perspective, it opens a number of possibilities for the same field and it can lead to confusion at the user end.
e.g. first time I register a model and the proposed change would apply the same description to both registered model and model version 1. Now I create another version with new description, however, my registered model is still showing me the same description for version 1 - this would be confusing. Wdyt?
@tarilabs @dbasunag Should we just remove the field (or omit empty)? It can be misleading.
I am fine with that.
@dbasunag I will drop a fix once experiment tracking client code is added to reduce potential refactors in both the returned model, base client, and public-facing client
Here is a possible solution, if anyone would like to drive this to completion
https://github.com/syntaxsdev/model-registry/commit/c7687906b20e3abbe50de8637420647e04e7d81b
if it's a source of confusion for the function register_model() I would rename the field from description to version_description to avoid the confusion?
@syntaxsdev , is it still open? If yes, will be good first one for me to get started
@ajit2704 Yes! This is still open.
Hey, I would like to help here. After reading the issues I got a little confused on the purpose of having one description for RegisteredModel and another for ModelVersion. When registering a model only the description in ModelVersion is set as far as I could notice here. As @syntaxsdev said these both may be misleading. I am seeing two possible solutions:
- Removing one of the fields.
- Having another field in
register_modelfor the missing description and set it in the object.
At the end include in the api endpoint. wdyt?
@nunoferna Good point! It sounds like we have some slightly varying opinions. Since we don't really use RegisteredModel.description, I think ideally we shouldn't have the field. But simply removing it would be a breaking change. It would also be tricky since we effectively inherit this description field from BaseModel and BaseResource in our server code.
I think my suggestion would be to:
- deprecate
RegisteredModel.descriptionin some form (model_registry.types.contexts) - later raise an error if this field is used (closest thing to removing)
Perhaps we shouldn't be inheriting from the openapi-generated RegisteredModel but at this point we are.
The deprecation could be done with a pydantic validator to prevent RegisteredModel(..., description="something") and maybe something like a @property that shows a deprecation warning if registered_model_instance.description is called
That looks good! Can I work on that?
Also, about the API response, should it include the description of the latest ModelVersion then?
and when indexed directly at:
curl http://localhost:8000/api/model_registry/v1alpha3/model_artifact?name=test-model&parentResourceId=3 it returned:
{ "artifactType": "model-artifact", "createTimeSinceEpoch": "1752003335640", "customProperties": {}, "id": "2", "lastUpdateTimeSinceEpoch": "1752003335640", "modelFormatName": "onnx", "modelFormatVersion": "v1", "name": "test-model", "state": "UNKNOWN", "uri": "s3://fake-bucket/fake-model" } Expected behavior Description included in registered model python resource or API response
Additional context Latest version of MR at the time or writing
My opinion is that we should leave the endpoint as is and simply change the python client's wrapper code. wdyt @syntaxsdev @tarilabs and others? Would the variance in fields between RegisteredModel and the REST api be a problem?
If the question is to deprecate the description parameter in the mr.register_model() method, I'm supportive.
If you want to also consider deprecating the mr.register_model() method name in favour of a future more descriptive name with semantic like mr.register_triplet_if_not_existing() ie mr.register(), I'm also supportive.
On the topic of the (pydantic) model, we need to keep alignment, and we need functionally to keep the ability to set/update the description of the entities in the Model Registry. ie I'm not supportive of removing description from RegisteredModel, ModelVersion, ModelArtifact, etc.
Since the confusion is about register_model(), I feel https://github.com/kubeflow/model-registry/issues/1284#issuecomment-3228463949 would resolve the confusion.
@nunoferna I was mistaken about RegisteredModel.description. We can't remove it. Feel free to take a stab at @tarilabs or @dbasunag 's suggestions. Renaming description kwarg on register_model() to version_description might be the simplest and most likely to be approved. (Though perhaps we need to leave description as an alias and have a deprecation warning so it's not a breaking change for now)
@jonburdo, just added a commit deprecating description to version_description. It has a warn() when description is set, if you don't want to use it please let me know. Also the noqa: C901 was because the pre-commit ruff format were not letting me commit that function for having >8 args.
@nunoferna Thank you for working on this! This looks like a good approach. Could you create a PR for this? We'll make sure CI passes and give it a review :)
And your commit message will just need to be signed off as explained here - You'll just need Signed-off-by: Your Name <[email protected]> where the email matches your primary github email.