model-registry icon indicating copy to clipboard operation
model-registry copied to clipboard

Model Registry API Simplification Question

Open rareddy opened this issue 9 months ago • 15 comments

I have a question about the API I do feel it is verbose in certain aspects, so I left some questions on side. Can someone please answer?

Path Methods Supported Question ?
/api/model_registry/v1alpha3/artifact GET could have been renamed like byname
/api/model_registry/v1alpha3/artifacts GET, POST OK
/api/model_registry/v1alpha3/artifacts/{id} GET, PATCH OK
/api/model_registry/v1alpha3/inference_service GET OK
/api/model_registry/v1alpha3/inference_services GET, POST Is POST client operation?
/api/model_registry/v1alpha3/inference_services/{inferenceserviceId} GET, PATCH Is PATCH client operation?
/api/model_registry/v1alpha3/inference_services/{inferenceserviceId}/model GET OK
/api/model_registry/v1alpha3/inference_services/{inferenceserviceId}/serves GET, POST Is POST client operation?
/api/model_registry/v1alpha3/inference_services/{inferenceserviceId}/version GET OK
/api/model_registry/v1alpha3/model_artifact GET is this not duplicate?
/api/model_registry/v1alpha3/model_artifacts GET, POST is this not duplicate?
/api/model_registry/v1alpha3/model_artifacts/{modelartifactId} GET, PATCH is this not duplicate?
/api/model_registry/v1alpha3/model_version GET OK
/api/model_registry/v1alpha3/model_versions GET, POST OK
/api/model_registry/v1alpha3/model_versions/{modelversionId} GET, PATCH OK
/api/model_registry/v1alpha3/model_versions/{modelversionId}/artifacts GET, POST, PATCH (on subresource) POST,PATCH seem verbose
/api/model_registry/v1alpha3/model_versions/{modelversionId}/artifacts/{artifactId} PATCH verbose
/api/model_registry/v1alpha3/registered_model GET could be renamed like byname?
/api/model_registry/v1alpha3/registered_models GET, POST OK
/api/model_registry/v1alpha3/registered_models/{registeredmodelId} GET, PATCH OK
/api/model_registry/v1alpha3/registered_models/{registeredmodelId}/versions GET, POST POST is verbose operation, what is the significance?
/api/model_registry/v1alpha3/serving_environment GET OK
/api/model_registry/v1alpha3/serving_environments GET, POST OK
/api/model_registry/v1alpha3/serving_environments/{servingenvironmentId} GET, PATCH OK
/api/model_registry/v1alpha3/serving_environments/{servingenvironmentId}/inference_services GET, POST is POST client operation? still verbose

IMO Artifact and ModelArtifact are verbose, which was evident while working on #1009 combined with DocArtifact. Any strong reasons for having these many derivations of the same? A discriminator enum could be used to define the same very easily.

I assumed anything to do with InferenceService is a GET for the end user, but what do POST or PATCH operations do? do they give client a way to deploy an inference service directly from MR?

I think a clear and concise API without duplication is preferable IMO, wdyt?

rareddy avatar May 03 '25 23:05 rareddy

I assumed anything to do with InferenceService is a GET for the end user, but what do POST or PATCH operations do? do they give client a way to deploy an inference service directly from MR?

The inferenceservice controller uses the REST API, POST and PATCH endpoints to create/update the inferenceservice records in the DB @rareddy

Al-Pragliola avatar May 05 '25 13:05 Al-Pragliola

The inferenceservice controller uses the REST API, POST and PATCH endpoints to create/update the inferenceservice records in the DB @rareddy

What is the significance of adding to the DB from the MR user perspective?

rareddy avatar May 05 '25 15:05 rareddy

The inferenceservice controller uses the REST API, POST and PATCH endpoints to create/update the inferenceservice records in the DB @rareddy

What is the significance of adding to the DB from the MR user perspective?

None, but is the MR user the only consumer of the MR API? Shouldn't the API contract be designed for general-purpose use, enabling different personas to perform a variety of use cases?

Rather than tailoring the API to a single user type, I think in an ideal world RBAC should dynamically dictate access to endpoints.

Having said that, yes, I also think we should reduce the number of endpoints a bit.

Al-Pragliola avatar May 05 '25 15:05 Al-Pragliola

What is the significance of adding to the DB from the MR user perspective?

In the specific case of Isvc, the Isvc logical model entity was added to support the use-case of Audit for the given model deployment (history). ie.: was this model deployed in the past at some point?

Since the K8s API does not offer an easy/ootb mechanism to query for past Resources (in this case, KServe Isvc) it was designed with this scope in mind.

An interesting aspect is that this use-case did not find practical application, while instead people are using the "current" Isvc (in MR) as a quicker mechanism to know where to find the Inference endpoints--rather than querying the K8s API. :)

fwiw I concur with @Al-Pragliola , possibly there are simplification worth considering, also given the experience collected so-far.

tarilabs avatar May 05 '25 15:05 tarilabs

None, but is the MR user the only consumer of the MR API? Shouldn't the API contract be designed for general-purpose use, enabling different personas to perform a variety of use cases?

Without a valid use case, how this is helping or used, IMO, these kinds of methods are confusing. If it is a general purpose, for example, from the controller, if you want to control something, I see a need for a separate API endpoint perhaps?

Rather than tailoring the API to a single user type, I think in an ideal world RBAC should dynamically dictate access to endpoints

I am only thinking from end user perspective; if there levels of end user like data scientists, ml engineers, admins etc and we need to restrict access, then the above makes sense to me.

An interesting aspect is that this use-case did not find practical application, while instead people are using the "current" Isvc (in MR) as a quicker mechanism to know where to find the Inference endpoints--rather than querying the K8s API. :)

I dig that! MR being a central place for information, masking the implementation detail to reach GET operation does make more sense to keep it

Having said that, yes, I also think we should reduce the number of endpoints a bit.

I would like your input on what we can do; above is my attempt. Since we are going to extend with catalog and storage aspects, having a concise and clear API might be helpful. My 2 cents :)

Thank you @tarilabs @Al-Pragliola for your replies.

rareddy avatar May 05 '25 16:05 rareddy

May I drop the bomb? @rareddy @tarilabs

Path Methods Supported
/api/model_registry/v1alpha3/artifacts GET, POST
/api/model_registry/v1alpha3/artifacts/{id} GET, PATCH
/api/model_registry/v1alpha3/inference_services GET, POST
/api/model_registry/v1alpha3/inference_services/{inferenceserviceId} GET, PATCH
/api/model_registry/v1alpha3/model_versions GET, POST
/api/model_registry/v1alpha3/model_versions/{modelversionId} GET, PATCH
/api/model_registry/v1alpha3/registered_models GET, POST
/api/model_registry/v1alpha3/registered_models/{registeredmodelId} GET, PATCH
/api/model_registry/v1alpha3/serving_environments GET, POST
/api/model_registry/v1alpha3/serving_environments/{servingenvironmentId} GET, PATCH

Al-Pragliola avatar May 06 '25 12:05 Al-Pragliola

I like it, concise without losing any functionality overall.

rareddy avatar May 06 '25 13:05 rareddy

@Al-Pragliola tongue-in-cheek at that point I'd question also why a ServingEnvironment is needed, since we care about the InferenceService (tied to a ModelVersion) and the namespace could be marked directly inside the InferenceService logical-model entity 😏

tarilabs avatar May 06 '25 13:05 tarilabs

@Al-Pragliola tongue-in-cheek at that point I'd question also why a ServingEnvironment is needed, since we care about the InferenceService (tied to a ModelVersion) and the namespace could be marked directly inside the InferenceService logical-model entity 😏

serving_environment -> namespace field in the inference_service

in doing that we should also drop the uniqueness of the name, because two inferenceservices can have the same name in two different namespaces @tarilabs

Path Methods Supported
/api/model_registry/v1alpha3/artifacts GET, POST
/api/model_registry/v1alpha3/artifacts/{id} GET, PATCH
/api/model_registry/v1alpha3/inference_services GET, POST
/api/model_registry/v1alpha3/inference_services/{inferenceserviceId} GET, PATCH
/api/model_registry/v1alpha3/model_versions GET, POST
/api/model_registry/v1alpha3/model_versions/{modelversionId} GET, PATCH
/api/model_registry/v1alpha3/registered_models GET, POST
/api/model_registry/v1alpha3/registered_models/{registeredmodelId} GET, PATCH

Al-Pragliola avatar May 06 '25 14:05 Al-Pragliola

I do not want to derail the conversation, however, consider the use case when in the future we will need to extend the serving_environment to represent not just in a cluster environment, but also bring in the environment from another cluster or a separate standalone environment.

rareddy avatar May 06 '25 15:05 rareddy

I do not want to derail the conversation, however, consider the use case when in the future we will need to extend the serving_environment to represent not just in a cluster environment, but also bring in the environment from another cluster or a separate standalone environment.

it will indeed pose the question of how you distinguish a ServingEnvironment called ("Production") that might exists in a cluster and not-in-a-cluster, since name ought to be unique; also, what are the current/to-be Requirement for ServingEnvironment having a dedicated entity are not fully spelled out.

I concur not to derail the convo, the point made by @Al-Pragliola here has merits, in my view.

tarilabs avatar May 06 '25 15:05 tarilabs

what are the current/to-be Requirement for ServingEnvironment having a dedicated entity are not fully spelled out.

I do not have an opinion on the Serving Environment in particular at this time as I did not take a look at the use case, but a requirement to recognize/represent/manage different environments is needed for MR. If Serving Environment concept fills need now, I would consider.

rareddy avatar May 06 '25 16:05 rareddy

- serving_environment -> namespace field in the inference_service
+ serving_environment -> environment field in the inference_service

I concur not to derail the convo, the point made by @Al-Pragliola here has merits, in my view.

tarilabs avatar May 06 '25 17:05 tarilabs

  • serving_environment -> namespace field in the inference_service
  • serving_environment -> environment field in the inference_service

I concur not to derail the convo, the point made by @Al-Pragliola here has merits, in my view.

hah please, I don't want to derail either, but it's too tempting... why not just use the custom properties of the inferenceService, there we can put whatever we want (e.g. namespace/cluster/location)?

Al-Pragliola avatar May 06 '25 17:05 Al-Pragliola

At least with MLMD we couldn't query based on it;

rareddy avatar May 06 '25 20:05 rareddy

@tarilabs @Al-Pragliola I am back looking at this issue in my Friday afternoon dabbling :) How do you recommend in folding DocArtifact, ModelArtifact to fold into Artifact?

i.e. I see ModelArtifact carries more extended properties on it

  modelFormatName:
    description: Name of the model format.
    type: string
  storageKey:
    description: Storage secret name.
    type: string
  storagePath:
    description: Path for model in storage provided by `storageKey`.
    type: string
  modelFormatVersion:
    description: Version of the model format.
    type: string
  serviceAccountName:
    description: Name of the service account with storage secret.
    type: string
  modelSourceKind:
    type: string
    description: "A string identifier describing the source kind. It differentiates various sources of model artifacts.\nThis identifier should be agreed upon by producers and consumers of source model metadata.\nIt is not an enumeration to keep the source of model metadata open ended. \nE.g. Kubeflow pipelines could use `pipelines` to identify models it produces."
  modelSourceClass:
    type: string
    description: |-
      A subgroup within the source kind. It is a specific sub-component or instance within the source kind.
      E.g. `pipelinerun` for a Kubeflow pipeline run.
  modelSourceGroup:
    type: string
    description: "Unique identifier for a source group for models from source class. \nIt maps to a physical group of source models. \nE.g. a Kubernetes namespace where the pipeline run was executed."
  modelSourceId:
    type: string
    description: |-
      A unique identifier for a source model within kind, class, and group.
      It should be a url friendly string if source supports using URLs to locate source models.
      E.g. a pipeline run ID.
  modelSourceName:
    type: string
    description: "A human-readable name for the source model. \nE.g. `my-project/1`, `ibm-granite/granite-3.1-8b-base:2.1.2`."
  uri:
    description: |-
      The uniform resource identifier of the physical artifact.
      May be empty if there is no physical artifact.
    type: string
  state:
    $ref: "#/components/schemas/ArtifactState"

where some of the properties may belong on the model, some may be identifiers for storage source (ArtifactStore tongue-in-cheek ) etc. Any guidance? Should I pull this into a separate issue?

rareddy avatar Jun 06 '25 22:06 rareddy

(ArtifactStore tongue-in-cheek )

assuming you have ModelVersion 1:N ArtifactStore(s), how would we indicate the one belonging to the model for deployment? Same question assuming the relationship is introduced for Artifact (ie Artifact 1:N ArtifactStore(s)).

Any guidance?

Common

  • state common status flag
  • uri considering MR is a metadata store, having the "pointer" from the Artifact to the actual resolvable location is needed. This is also needed for Deployment

Deployment

  • storageKey, storagePath are modeled after KServe StorageSpec , we use path for S3.
  • serviceAccountName is modeled after KServe (in predictor podspec) , afaik we don't use it atm but I'd keep for consistency
  • modelFormatName, modelFormatVersion are modeled after Kserve ModelFormat and we use them

so we use these fields not only to indicate the storage credentials, but specifically to record meaningful metadata for a successful KServe deployment (Isvc).

Origination

  • all modelSource* fields are used to indicate the "actor" which has produced the model, for example KFP (we have 1 use case already with DSP-ilab) and upcoming async K8s Job.

Conclusions

If the question is to simplify on the api Artifact taxonomy:

  • we could have Artifact per this proposal https://github.com/kubeflow/model-registry/issues/1055#issuecomment-2854484478 be modeled with these fields, where at runtime only in the case of "model" are getting populated (reminder: by the DB schema, it's not "wasted space" since properties are not part of the Artifact table but entries in ArtifactProperty table, so to avoid potential unfounded worries...).
  • further, have a "discriminator" field in Artifact, so the user can indicate which kind of Artifact it is (Model, Doc, Code, Data, Eval, etc). This would simplify the api taxonomy without having a prescribed ModelArtifact+DocArtifact+Artifact

Alternatively, if you want to keep only ModelArtifact with these fields:

  • consider removal of DocArtifact which today is not really used
  • rely on a discriminator field in the generic Artifact, to be used for anything which is not a ModelArtifact

@rareddy I would like to highlight your comment also bring up the point about having multiple Artifacts per ModelVersion, which is currently supported in backend, but we are not really from UI.

tarilabs avatar Jun 08 '25 06:06 tarilabs

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

github-actions[bot] avatar Sep 07 '25 04:09 github-actions[bot]

This issue has been automatically closed because it has not had recent activity. Please comment "/reopen" to reopen it.

github-actions[bot] avatar Sep 28 '25 04:09 github-actions[bot]