[Proposal] Make Openapi(REST) client the only entrypoint
At the moment Model Registry has two client entrypoints in the codebase:
-
pkg/openapi: the auto generated client from the openapi schema -
pkg/core: a grpc client that talks directly with the MLMD service
This creates a strong dependency on the MLMD service and makes difficult changing the underlying service without major breaking changes in the future, also we cannot trace/log/audit what users are doing to the metadata store if they bypass our proxy. My proposal is to move the package from pkg/core to internal/core and do a refactoring to make it a switchable datastore for the model registry. What I mean by switchable datastore is to change the config arguments that we take in the proxy:
- (change) mlmd_hostname -> datastore_hostname
- (change) mlmd_port -> datastore_port
- (create) datastore_service(type?) = mlmd / sqlite / whatever
then in model registry we can use a factory method to use the correct datastore implementation of the interface ModelRegistryServiceAPIServicer
I did a check on which projects import pkg/core and for upstream I didn't find any (https://pkg.go.dev/github.com/kubeflow/model-registry/pkg/core?tab=importedby), but we have a breaking change in the midstream (https://pkg.go.dev/github.com/opendatahub-io/model-registry/pkg/core?tab=importedby) we should make the team that works on opendatahub-io/odh-model-controller aware of this change.
Hey @Al-Pragliola, you have my +1 on this proposal for all the motivation you have already highlighted here!
I think that having a single entrypoint is always the best solution, especially here where we don't know if we will change the underlying "datastore" in the future.
Moreover, as you pointed out, I think this is the right timing to do this change as we don't have many external projects relying on the internal pkg/core.
(change) mlmd_hostname -> datastore_hostname (change) mlmd_port -> datastore_port (create) datastore_service(type?) = mlmd / sqlite / whatever
I agree on this generalization and I also like the datastore name.
then in model registry we can use a factory method to use the correct datastore implementation of the interface ModelRegistryServiceAPIServicer
Yeah that's an option, otherwise we could also have different core services, one per datastore that we could support in the future by keeping a single ModelRegistryServiceAPIServicer implementation and changing the injected core service accordingly.
I did a check on which projects import pkg/core and for upstream I didn't find any (https://pkg.go.dev/github.com/kubeflow/model-registry/pkg/core?tab=importedby)
AFAIK I can confirm this as well
but we have a breaking change in the midstream (https://pkg.go.dev/github.com/opendatahub-io/model-registry/pkg/core?tab=importedby) we should make the team that works on opendatahub-io/odh-model-controller aware of this change.
Yeah, that's the only component that is using the pkg/core directly that I am aware of, shouldn't be a huge effort to switch to the REST client.
Thank you @Al-Pragliola for this proposal, as anticipated I fully agree.
One minor suggestion in your diagram is:
- box currently labeled "External application 2" should actually read "Model Registry plug-ins/extensions"
we indeed want to avoid external developers/users consume an API which currently "leaks" an internal implementation choice (the gRPC connection to MLMD) hence motivating this proposal
Beyond the ModelRegistryServiceAPIServicer, as mentioned the "Core API" interface was designed to meet the MR requirements, even before the REST API layer came to be, and in general I can see an option where we could leverage it but without exposing the internal implementation gRPC connection.
Any comment on this documentation text being added btw?
(create) datastore_service(type?) = mlmd / sqlite / whatever
I do like this abstraction at the ModelRegistryServiceAPIServicer to include other datastores, this way we can propose an alternative to MLMD while we phase this out (although since we are alpha there is no reason support this beyond initial release).
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.
I believe this is on hold as
- https://github.com/kubeflow/model-registry/pull/425
is holding for the motivations agreed there
/hold