model-registry Make CSI depend on the model registry OpenAPI sepc

Is your feature request related to a problem? Please describe. At the moment the CSI heavily relies on the model registry project, this is because the model-registry dependency is imported in order to interact with model registry by making use of the auto-generated and provided client: https://github.com/kubeflow/model-registry/blob/332c0851544c525a68f8a467712b4b1542c25cf6/csi/pkg/storage/modelregistry_provider.go#L22-L27

This implementation could lead to some issues, especially from a maintenance point of view, see https://github.com/kubeflow/model-registry/pull/328#issuecomment-2324712508 as an example.

Describe the solution you'd like Decouple CSI and Model Registry by removing the model-registry dependency from the CSI dependencies.

Then update the CSI to make use of a custom and auto-generated REST client (or whatever could be valuable and easy-to-maintain approach) relying on the versioned model-registry OpenAPI specification.

Describe alternatives you've considered n/a

Additional context If we go for a custom auto-generated client to interact with the MR REST interface consider that we just need a couple of endpoints to invoke so I think it is worth to evaluate whether we can auto-generate just a subset of the OpenAPI spec.

Sep 02 '24 13:09 lampajr

Is there reason why CSI needs to be designed as extension to MR vs have it part of the MR code directly to avoid such issues?

Sep 02 '24 14:09 rareddy

Describe alternatives you've considered

an alternative I didn't have chance to investigate is whether workspaces would make this easier also to be properly managed from dependabot automated toolings.

Sep 02 '24 14:09 tarilabs

Is there reason why CSI needs to be designed as extension to MR vs have it part of the MR code directly to avoid such issues?

@rareddy TBH I've never considered this because I always tried to make CSI as isolated and independent project as possible (at least this was the initial goal) but now that you pointed out again I am starting thinking it is worth to evaluate whether adding another go cmd https://github.com/kubeflow/model-registry/tree/main/cmd just for the CSI. Obviously generating the executable for the CSI will rebuild the model-registry source code as well (but this is actually already happening with the relative import).

Don't know if that could introduce other issues but I think it is worth to investigate.

an alternative I didn't have chance to investigate is whether workspaces would make this easier also to be properly managed from dependabot automated toolings.

Good point, this is another thing I did not try yet too.

Sep 02 '24 15:09 lampajr

As discussed in the last bi-weekly mtg, as part of this issue we should probably evaluate the following options:

decouple the REST part, by relying on the openapi to generate the client
- Maybe Workspaces make it not necessary to “decouple this way”
ad-hoc command inside the existing Model Registry To be noticed here, would require another command (not subcommand)

Sep 02 '24 17:09 lampajr

@lampajr can you explain what is ad-hoc command or sub-command in this context? I do not understand it.

Sep 06 '24 15:09 rareddy

@lampajr can you explain what is ad-hoc command or sub-command in this context? I do not understand it.

With command I mean a dedicated Go script like main.go which then will use the model registry source code directly. The only issue I might see here is that every dependency that is required by CSI needs to be included in the root go.mod

With sub-command I mean another cli command like the one we already have named proxy.

Sep 06 '24 15:09 lampajr

Another difference between opting for command vs subcommand is the resulting artifact produced: in the case of subcommand, that would have been a "subroutine" in the same binary, while for CSI we only need a subset of the whole in the final binary (for CSI). I believe is important to keep in mind/highlight.

Sep 06 '24 15:09 tarilabs

Another difference between opting for command vs subcommand is the resulting artifact produced: in the case of subcommand, that would have been a "subroutine" in the same binary, while for CSI we only need a subset of the whole in the final binary (for CSI). I believe is important to keep in mind/highlight.

Definitely worth to mention it, thanks @tarilabs

Sep 06 '24 15:09 lampajr

Thanks, @tarilabs and @lampajr for the explanation, makes sense.

I guess command is the way we should do where the go.mod is common but different package structures produce multiple binaries. ex: https://stackoverflow.com/questions/50904560/how-to-structure-go-application-to-produce-multiple-binaries.

Sep 06 '24 16:09 rareddy

Yes @rareddy or something along those lines.

However as mentioned I believe Workspaces should be investigated first, if that ameliorates the problem of keeping deps in sync, without actually "mixing code". If that doesn't help, then the "different command(s) from the same go project" is likely the next simpler approach. my2c.

Sep 07 '24 07:09 tarilabs

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

Dec 07 '24 04:12 github-actions[bot]

After some investigation it turned out the best approach here is to keep CSI as different executable/command from the same model registry go module such that we don't have to maintain different modules that share part of the codebase. (changed the title of this issue reflecting the final decision)

The proposal can be found here 👉🏻 https://github.com/kubeflow/model-registry/pull/601

@tarilabs I think we can remove the lifecycle/stale here, wdyt?

Dec 10 '24 08:12 lampajr

@tarilabs I think we can remove the lifecycle/stale here, wdyt?

indeed this auto-resolved per https://github.com/kubeflow/model-registry/pull/601 thanks a lot again @lampajr for these contributions!! 🚀

Dec 11 '24 09:12 tarilabs