Add Model Catalog API to Model Registry

Open dhirajsb opened this issue 1 year ago • 13 comments

Is your feature request related to a problem? Please describe.

Several public ML and LLM model catalogs, such as Hugging Face, now offer easily accessible open-source models. At present, Kubeflow Model Registry only has an API for registering and managing locally trained and published Registered Models. Users therefore have to use a variety of websites, UIs, or APIs to browse and discover foundation models in various catalogs, and then register them manually for deployment.

Describe the solution you'd like

There is a need for a uniform and simple way to access the various Model Catalogs that host foundation models, so that users can easily discover models and register them in a Model Registry for local training, enhancement, and serving. The implementation could start simple by allowing users to create a curated model catalog source backed by a YAML file. The YAML file could contain a list of high-level foundation model metadata, some README text, and information about catalog model versions. In the future, other catalog source implementations can be created to allow browsing Hugging Face, OpenAPI, etc.

Describe alternatives you've considered

As an alternative, a common UI could be built with ad hoc client code that accesses the different catalogs to browse models and register them in a Kubeflow Model Registry. However, a consolidated, simple backend will make developing such a catalog browsing UI simpler in the future.

Additional context

As an example, a simple API that exposes information about models, such as the information provided in a Hugging Face model card, and that also supports simple catalog model search by name, tags, etc. would be incredibly useful for Kubeflow users.
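
To make the search idea concrete, here is a minimal sketch of filtering catalog entries by name and tags. It is only an illustration: `CatalogModel`, `search_models`, and the `example-org/vision-demo` entry are hypothetical names, not part of any existing Model Registry API.

```python
from dataclasses import dataclass, field


@dataclass
class CatalogModel:
    """Hypothetical in-memory view of one catalog entry."""
    name: str
    description: str = ""
    tags: list[str] = field(default_factory=list)


def search_models(models, name_contains=None, tags=None):
    """Return models whose name contains the substring and that carry all given tags."""
    wanted = set(tags or [])
    results = []
    for model in models:
        # Case-insensitive substring match on the model name.
        if name_contains and name_contains.lower() not in model.name.lower():
            continue
        # Every requested tag must be present on the model.
        if not wanted.issubset(model.tags):
            continue
        results.append(model)
    return results


CATALOG = [
    CatalogModel("ibm-granite/granite-3.1-8b-base", tags=["language", "granite-3.1"]),
    CatalogModel("ibm-granite/granite-3.1-8b-instruct", tags=["language", "granite-3.1"]),
    CatalogModel("example-org/vision-demo", tags=["vision"]),  # made-up entry
]

for model in search_models(CATALOG, name_contains="instruct", tags=["language"]):
    print(model.name)
```

A real API would presumably expose these filters as query parameters and add pagination; the sketch only shows the matching logic.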

dhirajsb avatar Jan 15 '25 07:01 dhirajsb

@andreyvelich @thesuperzapper we're working on a design for a simple model catalog API wrapper for local and remote catalogs like Hugging Face. I'll add more details, with a diagram etc., to this feature request tomorrow.

We're hoping it's a valuable addition to Model Registry that makes getting started with Kubeflow Model Registry easier and improves the user experience, especially for foundation models and LLMs.

dhirajsb avatar Jan 15 '25 07:01 dhirajsb

A proposed high-level architecture would look like the one below.

A ConfigMap can be used to configure a small amount of high-level metadata for each catalog source in a collection, such as its name, type, and description. A catalog source may also include a secretName if credentials are required to connect to the catalog. E.g. a Hugging Face source could store its API secret/token in a Kubernetes Secret.

Model registry would implement the logic to handle different catalog types, and process the catalog source information to fetch the catalog or create a remote connection to it. When Catalog API operations to list CatalogModels and CatalogModelRevisions are received, model registry will query the catalog source implementation logic to fetch the requested information.
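
The dispatch described above could be sketched roughly as follows. The `name`, `type`, `description`, and `secretName` fields come from the proposal. Everything else (the class names, the `static` type string, the inline `models` list, and the `list_models` method) is an assumption made for illustration.

```python
from abc import ABC, abstractmethod


class CatalogSource(ABC):
    """One configured catalog; subclasses hold the type-specific fetch logic."""

    def __init__(self, config):
        self.name = config["name"]
        self.description = config.get("description", "")
        # Name of a Kubernetes Secret holding credentials, if the source needs any.
        self.secret_name = config.get("secretName")

    @abstractmethod
    def list_models(self):
        """Return the names of the models this source exposes."""


class StaticCatalogSource(CatalogSource):
    """Stand-in for a YAML-backed curated catalog; models are passed inline here."""

    def __init__(self, config):
        super().__init__(config)
        self._models = list(config.get("models", []))

    def list_models(self):
        return list(self._models)


# Maps the `type` field of a source entry to its implementation (hypothetical names).
SOURCE_TYPES = {"static": StaticCatalogSource}


def load_sources(entries):
    """Build one CatalogSource per config entry, keyed by source name."""
    sources = {}
    for entry in entries:
        impl = SOURCE_TYPES.get(entry["type"])
        if impl is None:
            raise ValueError(f"unknown catalog source type: {entry['type']!r}")
        sources[entry["name"]] = impl(entry)
    return sources


def list_catalog_models(sources, source_name):
    """Handle a 'list CatalogModels' request by delegating to the named source."""
    return sources[source_name].list_models()
```

Supporting a new catalog type then means adding one class and one entry in `SOURCE_TYPES`, without touching the request handling.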

Image

dhirajsb avatar Jan 16 '25 06:01 dhirajsb

Here is a draft spec for the model catalog YAML:

$id: https://kubeflow.org/model-registry/catalog.yaml
$schema: https://json-schema.org/draft/2020-12/schema
title: Model Catalog
type: object
properties:
  models:
    type: array
    items:
      type: object
      properties:
        name:
          type: string
          description: Unique name for the model.
          example: ibm-granite/granite-3.1-8b-base
        provider:
          type: string
          example: IBM
        description:
          type: string
          description: Short description of the model.
        readmeLink:
          type: string
          description: URL to a text or markdown file with more information.
        language:
          type: array
          description: List of supported languages (https://en.wikipedia.org/wiki/List_of_ISO_639_language_codes).
          items:
            type: string
          example:
            - en
            - es
            - cs
        license:
          type: string
          description: Short name of the model's license.
          example: apache-2.0
        licenseLink:
          type: string
          description: URL to the license text.
        libraryName:
          type: string
          example: transformers
        baseModel:
          type: array
          description: Reference to the base model (if any).
          items:
            type: object
            properties:
              catalog:
                type: string
                description: |-
                  Name of the catalog for an external base model. Omit for
                  models in the same catalog.
                example: huggingface.io
              repository:
                type: string
                description: |-
                  Name of the repository in an external catalog where the base
                  model exists. Omit for models in the same catalog.
                example: ibm-granite
              model:
                type: string
                example: granite-3.1-8b-base
        tags:
          type: array
          example:
            - language
          items:
            type: string
        tasks:
          type: array
          description: List of tasks the model is designed for.
          items:
            type: string
          example:
            - text-generation
        createTimeSinceEpoch:
          description: Creation time in milliseconds since epoch.
          type: integer
        lastUpdateTimeSinceEpoch:
          description: Last update time in milliseconds since epoch.
          type: integer
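
As a rough illustration of what the draft spec constrains, here is a hand-rolled check of a few of its fields. This is only a sketch: `check_model_entry` is a hypothetical helper, it covers a subset of the properties, and a real implementation would presumably use a proper JSON Schema validator instead.

```python
# Expected shapes for a subset of the draft spec's properties.
STRING_FIELDS = (
    "name", "provider", "description", "readmeLink",
    "license", "licenseLink", "libraryName",
)
STRING_LIST_FIELDS = ("language", "tags", "tasks")


def check_model_entry(entry):
    """Return a list of human-readable problems; an empty list means no problem was found."""
    problems = []
    # The draft spec declares no required properties, so missing keys are accepted.
    for key in STRING_FIELDS:
        if key in entry and not isinstance(entry[key], str):
            problems.append(f"{key}: expected a string")
    for key in STRING_LIST_FIELDS:
        value = entry.get(key, [])
        if not isinstance(value, list) or not all(isinstance(v, str) for v in value):
            problems.append(f"{key}: expected a list of strings")
    base = entry.get("baseModel", [])
    if not isinstance(base, list) or not all(isinstance(b, dict) for b in base):
        problems.append("baseModel: expected a list of objects")
    return problems


print(check_model_entry({"name": "ibm-granite/granite-3.1-8b-base", "language": ["en", "es"]}))
print(check_model_entry({"tags": "language"}))
```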

Here's a quick example of how this will look, using a few IBM Granite models from Hugging Face:

models:
- name: ibm-granite/granite-3.1-8b-base
  provider: IBM
  description: A decoder-only code model designed for code generative tasks
  readmeLink: https://huggingface.co/ibm-granite/granite-3.1-8b-base/raw/main/README.md
  language: ["ar", "cs", "de", "en", "es", "fr", "it", "ja", "ko", "nl", "pt", "zh"]
  license: apache-2.0
  licenseLink: https://www.apache.org/licenses/LICENSE-2.0.txt
  libraryName: transformers
  tags:
    - language
    - granite-3.1
  tasks:
    - text-generation
  createTimeSinceEpoch: 1733514949000
  lastUpdateTimeSinceEpoch: 1734637721000
- name: ibm-granite/granite-3.1-8b-instruct
  provider: IBM
  description: A fine-tuned model based on Granite 8B Code Base
  readmeLink: https://huggingface.co/ibm-granite/granite-3.1-8b-instruct/raw/main/README.md
  language: ["ar", "cs", "de", "en", "es", "fr", "it", "ja", "ko", "nl", "pt", "zh"]
  license: apache-2.0
  licenseLink: https://www.apache.org/licenses/LICENSE-2.0.txt
  libraryName: transformers
  baseModel:
    - catalog: huggingface.io
      repository: ibm-granite
      model: granite-3.1-8b-base
  tags:
    - language
    - granite-3.1
  tasks:
    - text-generation
  createTimeSinceEpoch: 1733514949000
  lastUpdateTimeSinceEpoch: 1734637721000
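
Since the timestamps are milliseconds since the epoch, a client has to divide by 1000 before treating them as Unix seconds. A small sketch using the values from the example above (`epoch_millis_to_utc` is a hypothetical helper name):

```python
from datetime import datetime, timezone


def epoch_millis_to_utc(millis):
    """Convert milliseconds since the Unix epoch to a timezone-aware UTC datetime."""
    return datetime.fromtimestamp(millis / 1000, tz=timezone.utc)


created = epoch_millis_to_utc(1733514949000)
updated = epoch_millis_to_utc(1734637721000)
print(created.isoformat())  # 2024-12-06T19:55:49+00:00
print(updated.isoformat())  # 2024-12-19T19:48:41+00:00
```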

pboyd avatar Jan 21 '25 13:01 pboyd

Based on some feedback, we need some additional information in the catalog:

  • catalog source
  • long description
  • maturity
  • artifacts

Updated schema and example are in this gist: https://gist.github.com/pboyd/278c7b1e9ce0292b82cb871fa7d2103b

pboyd avatar Jan 24 '25 20:01 pboyd

Is there an OpenAPI definition for this Catalog model, or is it going to use the existing Model Registry API?

rareddy avatar Jan 25 '25 00:01 rareddy

The OpenAPI version is here in the initial version PR.

dhirajsb avatar Jan 25 '25 00:01 dhirajsb

Perhaps it makes sense to have the Catalog API exist as a separate process/deployable/container.

To my understanding, the governance aspect of the model-registry is critical in terms of stability/reliability. A consumer of the model-registry depends on the API being available; if the registry were down, an automated training run might not get registered (representing a significant loss in training time and compute cost).

Therefore, I think it makes sense to isolate the load/traffic of catalog operations from the core model-registry. I threw a quick diagram together to help illustrate this architecture. A separate Service would be associated with this component, and therefore separate routes/ingresses/etc. can be defined.

The additional benefit is to be able to allow the model-registry itself to become a CatalogSource, which can be queried. This allows a single CatalogAPI instance to be able to reach out to multiple model-registries (in or between clusters), further centralizing "model discovery" within a cluster.

Image

Let me know your thoughts. I don't have strong feelings on this point, so if it's a big project to separate the component out, just say so! I figured it's easier to do now rather than later.

Crazyglue avatar Jan 27 '25 16:01 Crazyglue

> I think it makes sense to isolate the load/traffic of catalog operations from the core model-registry.

This proposal doesn't take that option away. If the model registry container image supports specifying which service(s) need to be started, either as a command or as a command-line option, we can configure it to run as a separate deployment and service. Whether we want to produce a distinct binary and container image is a slightly orthogonal conversation.

I don't have a strong preference one way or another how MR and MC services are deployed. It depends on end user's scaling needs. That's why the initial PR started by keeping options open with a single binary and cmdline options.
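
To illustrate the single-binary idea, here is a minimal sketch of selecting services with a command-line option. The `--services` flag and the service names `registry` and `catalog` are made up for this example; they are not options of the actual Model Registry binary.

```python
import argparse

KNOWN_SERVICES = ("registry", "catalog")  # hypothetical service names


def parse_services(argv):
    """Return the list of services to start, based on a comma-separated flag."""
    parser = argparse.ArgumentParser(description="Start selected services from one binary.")
    parser.add_argument(
        "--services",
        default="registry,catalog",
        help="comma-separated list of services to start",
    )
    args = parser.parse_args(argv)
    selected = [s.strip() for s in args.services.split(",") if s.strip()]
    unknown = [s for s in selected if s not in KNOWN_SERVICES]
    if unknown:
        # parser.error prints a usage message and exits with status 2.
        parser.error(f"unknown service(s): {', '.join(unknown)}")
    return selected


print(parse_services([]))                         # both services, co-located
print(parse_services(["--services", "catalog"]))  # catalog only, e.g. its own Deployment
```

With something like this, the same image could back two Deployments that differ only in their container args.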

> The additional benefit is to be able to allow the model-registry itself to become a CatalogSource, which can be queried.

If we follow the high level architecture in the proposal, we could simply create a RegistryCatalogImpl that implements ModelCatalogApi that would allow browsing any model registry (not just the co-located registry) as a CatalogSource. The logic in RegistryCatalogImpl would also filter out MR specific metadata, such as owners, state, etc. from the Catalog view.

That way we avoid polluting the MC data model with MR details and vice versa.

I can also imagine that there could be filtering requirements to limit which RegisteredModels are shown in a catalog view of the registry, e.g. only ones marked with a tag like cataloged. A RegistryCatalogImpl could expose configuration for specifying this in the catalog sources yaml config file.
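
A rough sketch of what that could look like, purely as an assumption-laden illustration: `fetch_registered_models` is a hypothetical callable standing in for a registry client, and the field names `owner` and `state` are placeholders for whatever MR-specific metadata gets filtered out.

```python
# Fields treated as registry-only and stripped from the catalog view.
# The exact set is an assumption made for this sketch.
MR_ONLY_FIELDS = {"owner", "state"}


class RegistryCatalogImpl:
    """Sketch of a catalog source backed by a model registry's RegisteredModels."""

    def __init__(self, fetch_registered_models, required_tag=None):
        # fetch_registered_models: hypothetical callable returning a list of dicts.
        self._fetch = fetch_registered_models
        # Optional tag a RegisteredModel must carry to appear in the catalog view.
        self._required_tag = required_tag

    def list_models(self):
        catalog_models = []
        for rm in self._fetch():
            if self._required_tag and self._required_tag not in rm.get("tags", []):
                continue
            # Copy everything except the registry-only metadata.
            catalog_models.append(
                {k: v for k, v in rm.items() if k not in MR_ONLY_FIELDS}
            )
        return catalog_models


def fake_fetch():
    """Made-up registry contents for the example."""
    return [
        {"name": "a", "tags": ["cataloged"], "owner": "alice", "state": "LIVE"},
        {"name": "b", "tags": [], "owner": "bob", "state": "LIVE"},
    ]


print(RegistryCatalogImpl(fake_fetch, required_tag="cataloged").list_models())
```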

wdyt?

dhirajsb avatar Jan 27 '25 23:01 dhirajsb

We should be careful in displaying the RegisteredModels in the Catalog, as models in MR are under strict RBAC rules. The catalog must use logged-in user credentials to verify their access.

rareddy avatar Jan 28 '25 04:01 rareddy

> We should be careful in displaying the RegisteredModels in the Catalog, as models in MR are under strict RBAC rules. The catalog must use logged-in user credentials to verify their access.

Yes, having a separate model registry catalog source would allow us to separate RBAC concerns there to make sure credentials are being forwarded and any filtering is done appropriately.
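
One way to picture the credential forwarding, as a sketch only: the catalog source passes the caller's token through to the registry, and the registry decides what that user may see. `FakeRegistry`, `RegistryCatalogSource`, and the token-to-models ACL are stand-ins invented for this example; real RBAC checks would go through Kubernetes.

```python
class FakeRegistry:
    """Stand-in for a model registry that enforces its own access rules per token."""

    def __init__(self, model_names, acl):
        self._model_names = model_names  # all RegisteredModel names
        self._acl = acl                  # token -> set of model names it may read

    def list_registered_models(self, token):
        allowed = self._acl.get(token)
        if allowed is None:
            raise PermissionError("unknown or missing credentials")
        return [name for name in self._model_names if name in allowed]


class RegistryCatalogSource:
    """Catalog source that never uses its own identity to read from the registry."""

    def __init__(self, registry):
        self._registry = registry

    def list_models(self, user_token):
        # Forward the caller's token instead of a service account, so the
        # registry's access rules decide what this user may see.
        return self._registry.list_registered_models(user_token)


registry = FakeRegistry(["m1", "m2"], {"alice-token": {"m1"}})
source = RegistryCatalogSource(registry)
print(source.list_models("alice-token"))
```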

dhirajsb avatar Jan 28 '25 17:01 dhirajsb

I created a new architecture diagram to better illustrate the proposal to expose registries in the catalog service:

Image

dhirajsb avatar Jan 28 '25 17:01 dhirajsb

I'd like to propose a few changes to the design that's presented here.

architecture-beta
    group kfns[Kubeflow Namespace]
    group profilens[Profile Namespace]
    group ext(cloud)[External]

    service mc(server)[Model Catalog] in kfns
    service mr(server)[Model Registry] in profilens
    service csconf(database)[Catalog ConfigMap] in kfns
    service cssecret(database)[Catalog Source Secret] in kfns
    service cs1(server)[Catalog Source] in ext
    service cs2(server)[Catalog Source] in ext

    mr:T -- B:mc
    csconf:R -- L:mc
    cssecret:R -- L:mc
    cs1:L -- R:mc
    cs2:L -- R:mc

I don't believe this is a major departure from @dhirajsb's last diagram, but I would like to call out a few differences:

  • the model catalog is distinct from model registry
    • I would like the catalog to be another binary in the same image (like csi and controller are)
  • the model catalog is in the kubeflow namespace and model registry is in a profile namespace
    • which is to say that model catalog is shared across all profiles
  • model registry calls model catalog, not the other way around

If anyone sees this in the next few hours and would like to discuss it in person, I've added an agenda item for it on today's community call. Or leave a comment here. It would be great to get some more eyes on the proposed architecture.

pboyd avatar May 12 '25 14:05 pboyd

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

github-actions[bot] avatar Sep 08 '25 04:09 github-actions[bot]