Remove the ML-Metadata Dependency from the Model Registry

Open rareddy opened this issue 1 year ago • 9 comments

I propose that we remove the ml-metadata component from the Model Registry offering.

When the Model Registry project began in the Kubeflow community, our goal was to contribute meaningfully and serve community needs using the ml-metadata project. However, over time, we have consistently encountered challenges in maintaining and supporting this component as project leads, whether due to the difficulty in recruiting C++ contributors or the inability to successfully contribute to the project. Additionally, the needs of the Model Registry appear to be much narrower than what ml-metadata provides. Our initial intent was also to ensure seamless integration with the KFP component, but that has not materialized either. We would still like to pursue this, but we are hoping there may be a better way to do this integration, with more community collaboration with the KFP component team.

Describe the solution you'd like

I propose implementing a Data API layer over the MLMD server to ensure that there are no changes to the end-user API layer as an initial step. Following this, we can introduce a database (starting with the existing MySQL instance) and implement the persistence layer directly within the Model Registry server using the defined Data API layer.
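A minimal sketch of what such a Data API boundary could look like in Go. Everything named here (`ModelStore`, `RegisteredModel` with only two fields, `memoryStore`, `ErrNotFound`) is a hypothetical illustration, not the project's actual code; the in-memory store merely stands in for whichever backend sits behind the interface, first an MLMD-backed one and later a database-backed one.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

// RegisteredModel is a simplified, hypothetical domain type for illustration.
type RegisteredModel struct {
	ID   int64
	Name string
}

// ErrNotFound is returned when no model exists for the requested ID.
var ErrNotFound = errors.New("registered model not found")

// ModelStore is the hypothetical Data API boundary. REST handlers would
// depend only on this interface, so the backend can be swapped freely.
type ModelStore interface {
	CreateRegisteredModel(name string) (RegisteredModel, error)
	GetRegisteredModel(id int64) (RegisteredModel, error)
}

// memoryStore is a stand-in backend used only to keep the sketch runnable.
type memoryStore struct {
	mu     sync.Mutex
	nextID int64
	models map[int64]RegisteredModel
}

func newMemoryStore() *memoryStore {
	return &memoryStore{nextID: 1, models: make(map[int64]RegisteredModel)}
}

func (s *memoryStore) CreateRegisteredModel(name string) (RegisteredModel, error) {
	if name == "" {
		return RegisteredModel{}, errors.New("name must not be empty")
	}
	s.mu.Lock()
	defer s.mu.Unlock()
	m := RegisteredModel{ID: s.nextID, Name: name}
	s.models[m.ID] = m
	s.nextID++
	return m, nil
}

func (s *memoryStore) GetRegisteredModel(id int64) (RegisteredModel, error) {
	s.mu.Lock()
	defer s.mu.Unlock()
	m, ok := s.models[id]
	if !ok {
		return RegisteredModel{}, ErrNotFound
	}
	return m, nil
}

func main() {
	// Callers hold the interface, never the concrete backend.
	var store ModelStore = newMemoryStore()
	created, err := store.CreateRegisteredModel("fraud-detector")
	if err != nil {
		panic(err)
	}
	fetched, err := store.GetRegisteredModel(created.ID)
	if err != nil {
		panic(err)
	}
	fmt.Println(fetched.ID, fetched.Name)
}
```

Because callers see only the interface, replacing the backend behind it would not require changes to the end-user API layer, which is the property the first step is meant to secure.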

Since the Model Registry does not expose any gRPC API, this transition will not impact existing Model Registry clients.

Additionally, we should retain the MLMD offering as an optional configuration for those who wish to continue using it in its current form. However, we plan to withdraw official support in the next release.

Describe alternatives you've considered

During the project's conceptualization, there was an initial plan to rewrite ml-metadata in Golang. However, in retrospect, this approach is not viable, as it would require significant resources and time without delivering substantial benefits.

Additional context

Since the Model Registry REST API server is implemented in Golang, we should identify a suitable GORM tool that supports migration tools such as Flyway or golang-migrate for implementing the Data API layer.

To facilitate the transition, we can initially reuse the existing database schema from the ml-metadata database instance. If feasible, we should explore options to reverse-engineer and generate the necessary GORM code to minimize development effort.

Maintaining the existing schema at the outset will also allow the Model Registry to extend its functionality for future integrations, such as with Katib.

rareddy avatar Mar 11 '25 21:03 rareddy

I think this would also facilitate an "inline use-case" that emerges in some contexts, such as integrations with llama-stack. For example, currently leveraging ml-metadata on Apple Silicon is not possible, and this gives us an opportunity to consider/reconsider:

  • Model Registry (server) to simply expose REST API via FastAPI (ie implementing it in Python) so that's "natively available"
  • in 2025, we can consider making a deployment available via docker/podman, rather than embedding a binary executable in the Wheel, for the MR py client to interface with locally when the strategy is "inline"

tarilabs avatar Mar 12 '25 11:03 tarilabs

Model Registry (server) to simply expose REST API via FastAPI (ie implementing it in Python) so that's "natively available"

There is definitely an advantage in doing this for quick local usage and testing. We should take this task as a separate issue as a follow-up.

rareddy avatar Mar 12 '25 12:03 rareddy

Since the Model Registry REST API server is implemented in Golang, we should identify a suitable GORM tool that supports migration tools such as Flyway or golang-migrate for implementing the Data API layer.

A little off-topic from the main point of the proposal (which sounds great, imo), but it's worth considering some tools other than GORM. sqlc sounds like a good fit.

pboyd avatar Mar 12 '25 14:03 pboyd

sqlc sounds like a good fit.

Nice, +1, I did not know about this! My recommendation is to avoid handwritten code, since there are good ORM/code generators that expedite development and make the code easier to manage.

rareddy avatar Mar 12 '25 15:03 rareddy

+1, on that detail

  • we don't want to hand-craft code but have some code generation or helper framework of some sort
  • must support data migration either as a capability, or integrating with a complementary data migration toolkit
  • must support bespoke queries since we'll need to expand on this area
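
To make the data migration requirement concrete, here is a minimal sketch of the core idea behind versioned migrations: ordered steps, a recorded current version, and only pending steps applied. It is not the API of golang-migrate, Flyway, or any other tool; `Migration`, `applyPending`, and the table names are hypothetical, and the schema is modelled as a plain map so the example runs without a database.

```go
package main

import (
	"errors"
	"fmt"
	"sort"
)

// Migration is a hypothetical versioned schema change.
type Migration struct {
	Version int
	Name    string
	Up      func(schema map[string]bool) error
}

// applyPending runs every migration whose Version is greater than current,
// in ascending version order, and returns the resulting schema version.
func applyPending(current int, schema map[string]bool, ms []Migration) (int, error) {
	sorted := append([]Migration(nil), ms...) // copy so the caller's slice is not reordered
	sort.Slice(sorted, func(i, j int) bool { return sorted[i].Version < sorted[j].Version })
	for _, m := range sorted {
		if m.Version <= current {
			continue // already applied
		}
		if err := m.Up(schema); err != nil {
			return current, fmt.Errorf("migration %d (%s): %w", m.Version, m.Name, err)
		}
		current = m.Version
	}
	return current, nil
}

// exampleMigrations is deliberately listed out of order; table names are illustrative.
func exampleMigrations() []Migration {
	return []Migration{
		{Version: 2, Name: "add_model_version", Up: func(s map[string]bool) error {
			if !s["registered_model"] {
				return errors.New("registered_model table is missing")
			}
			s["model_version"] = true
			return nil
		}},
		{Version: 1, Name: "add_registered_model", Up: func(s map[string]bool) error {
			s["registered_model"] = true
			return nil
		}},
	}
}

func main() {
	schema := map[string]bool{}
	version, err := applyPending(0, schema, exampleMigrations())
	if err != nil {
		panic(err)
	}
	fmt.Println("schema version:", version)
}
```

A real migration toolkit would persist the current version in the database itself and typically wrap each step in a transaction where the database allows it; the sketch only shows the ordering and skip logic.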

my2c

tarilabs avatar Mar 13 '25 07:03 tarilabs

successfully contribute to the project

The ratio can be seen in this query, as one example:

  • https://github.com/google/ml-metadata/issues?q=involves%3Atarilabs%20OR%20involves%3Adhirajsb%20OR%20involves%3Alampajr%20OR%20involves%3Aisinyaaa%20

tarilabs avatar Mar 13 '25 08:03 tarilabs

We don't need to reinvent the wheel for this. The original MR team (including @tarilabs) was already pretty far down the road towards replacing MLMD back in Sep 2023.

If you look at the git history around that time, you'll find a complete GORM type model schema and a DB interface layer, along with a YAML-driven code generation layer that generated goverters to be used in the REST API as well.

We'd save several weeks of work if we use that as a starting point. Let me know if someone would like to take the initiative to look at the code history point that had most of what we'd need. I'd be happy to help.

dhirajsb avatar Apr 01 '25 05:04 dhirajsb

I've added a new issue, #1017, that tracks the work towards removing MLMD.

Al-Pragliola avatar Apr 24 '25 08:04 Al-Pragliola

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

github-actions[bot] avatar Jul 24 '25 04:07 github-actions[bot]

This is actively being worked on by @Al-Pragliola.

tarilabs avatar Jul 24 '25 06:07 tarilabs

we can consider this fully completed with:

  • https://github.com/kubeflow/model-registry/issues/1017
  • https://github.com/kubeflow/model-registry/releases/tag/v0.3.0

tarilabs avatar Sep 02 '25 08:09 tarilabs