mlflow [FR] [Roadmap] Create a detailed example of creating a custom model flavor

MLflow Roadmap Item

This is an MLflow Roadmap item that has been prioritized by the MLflow maintainers. We’ve identified this feature as a highly requested addition to the MLflow package based on community feedback. We're seeking a community contribution for the implementation of this feature and will enthusiastically support the development and review of a submitted PR for this.

Contribution Note

As with other roadmap items, there may be a desire for multiple contributors to work on an issue. While we don’t discourage collaboration, we strongly encourage assigning a primary contributor to each roadmap issue to simplify the merging process. The items on the roadmap are of high priority. Due to the widespread demand for roadmap features, we encourage potential contributors to take on the work of creating a PR, making changes, and ensuring adequate test coverage for the feature only if they are willing and able to see the implementation through to a merged state.

Feature scope

This roadmap feature’s complexity is classified as:

[X] good-first-issue: This feature is limited in complexity and effort required to implement.
[ ] simple: This feature does not require a large amount of effort to implement and / or is clear enough to not need a design discussion with maintainers.
[ ] involved: This feature will require a substantial amount of development effort but does not require an agreed-upon design from the maintainers. The feedback given during the PR phase may be involved and necessitate multiple iterations before approval. (Please bear with us as we collaborate with you to make a great contribution)
[ ] design-recommended: This is a substantial feature that should have a design document approved prior to working on an implementation (to save your time, not ours). After agreeing to work on this feature, a maintainer will be assigned to support you throughout the development process.

Proposal Summary

The current example for custom model flavors is inadequate as a guide. This is a common question that gets raised amongst users. This FR is a request to create a much more in-depth example of a custom model flavor that shows its construction in-line within the docs, provides explanations for what is required, and shows usage of it with a screenshot of the model within the UI (namely the serialized model artifact within the run page). Custom flavors should be introduced as separate GitHub repositories with documentation provided in https://mlflow.org/docs/latest/plugins.html#community-plugins.

Motivation

What is the use case for this feature?

Provide an example and better explanation for a common ask that users struggle with.

Why is this use case valuable to support for MLflow users in general?

Many users ask how to do this to incorporate a non-officially supported model flavor or request inclusion of esoteric libraries that will not be considered for official inclusion into MLflow due to low usage.

What component(s), interfaces, languages, and integrations does this feature affect?

Components

[ ] area/artifacts: Artifact stores and artifact logging
[ ] area/build: Build and test infrastructure for MLflow
[X] area/docs: MLflow documentation pages
[X] area/examples: Example code
[ ] area/model-registry: Model Registry service, APIs, and the fluent client calls for Model Registry
[ ] area/models: MLmodel format, model serialization/deserialization, flavors
[ ] area/projects: MLproject format, project running backends
[ ] area/scoring: MLflow Model server, model deployment tools, Spark UDFs
[ ] area/server-infra: MLflow Tracking server backend
[ ] area/tracking: Tracking Service, tracking client APIs, autologging

Interfaces

[ ] area/uiux: Front-end, user experience, plotting, JavaScript, JavaScript dev server
[ ] area/docker: Docker use across MLflow's components, such as MLflow Projects and MLflow Models
[ ] area/sqlalchemy: Use of SQLAlchemy in the Tracking Service or Model Registry
[ ] area/windows: Windows support

Languages

[ ] language/r: R APIs and clients
[ ] language/java: Java APIs and clients
[ ] language/new: Proposals for new client languages

Integrations

[ ] integrations/azure: Azure and Azure ML integrations
[ ] integrations/sagemaker: SageMaker integrations
[ ] integrations/databricks: Databricks integrations

Jun 17 '22 20:06 BenWilson2

For any questions, concerns, or clarification on implementing this issue, please ping @WeichenXu123

Jun 21 '22 18:06 BenWilson2

@BenWilson2 I would love to do this task, can you assign the issue to me?

Jun 26 '22 14:06 ikrizanic

Hi @ikrizanic, a few days ago Lakshika Parihar pinged me and also wants to do this task. Let's wait for his response first.

Jun 27 '22 06:06 WeichenXu123

Hey @WeichenXu123, yeah, I would love to do this task. You can assign the issue to me.

Jun 27 '22 18:06 lakshikaparihar

@lakshikaparihar are you still working on this issue?

Jul 05 '22 05:07 AMMAR-62

@AMMAR-62 Yeah, it was only assigned to me today, so I will start working on it.

Jul 05 '22 06:07 lakshikaparihar

Hi @lakshikaparihar, are you still working on this item? If not, @AMMAR-62 , are you interested in contributing it?

Aug 09 '22 17:08 dbczumar

@dbczumar If @lakshikaparihar or @AMMAR-62 won't be able to do it at the moment, I would be happy to contribute

Aug 09 '22 19:08 ikrizanic

@ikrizanic Sounds good! I will assign you to the issue :). @AMMAR-62 @lakshikaparihar , feel free to collaborate with @ikrizanic.

Aug 09 '22 19:08 dbczumar

@dbczumar would be very happy to work on this issue if @ikrizanic cannot devote any capacity at the moment.

I could start dedicating significant capacity beginning of December, but would definitely need some guidance/alignment before starting to contribute

Nov 23 '22 12:11 benjaminbluhm

@benjaminbluhm @dbczumar My apologies, I forgot about this issue and currently don't have any capacity to work on it, so I'll be happy to hand it over to you.

Nov 23 '22 13:11 ikrizanic

@BenWilson2 @dbczumar I have a few questions to better understand your idea for the custom flavor example to be added to the docs.

Do you suggest we create the custom flavor example using a simple fake model (along the lines of the "add n" model in the docs), or would you prefer we create this for a well-established ML library?

My thought was to first create the custom flavor example for this FR in a separate GitHub repo, and once this is implemented I could start working on the PR for this FR. Do you think this strategy makes sense, or would you suggest a different route?

And would it make sense to align on a template for a custom flavor repo that can be followed by others? (scikit-learn has a project template for scikit-learn compatible extensions, so I thought having some kind of template could also help users create their own MLflow custom flavor repo.) I can look at the custom flavor PRs in the mlflow repo, but at the moment I am not quite sure what would be a suitable folder and module structure for a custom flavor in a dedicated repo. It would be great to get some guidance from you on this aspect.

I might have some more questions but would rather wait for your initial feedback.

Thanks a lot!

Nov 23 '22 19:11 benjaminbluhm

Hi @benjaminbluhm, I think the best place for a step-by-step working example to live would be within the examples section of MLflow, here: https://github.com/mlflow/mlflow/tree/master/examples . The main crux of this FR isn't just a code example though. It's in building a step-by-step walkthrough of the components, nuances, and 'gotchas' surrounding custom model creation and utilization to support use cases that aren't directly supported natively within named flavors. This could be expanding the functionality of a named flavor (the simpler, the better; no need to try to demonstrate graph embeddings on a TF core model) or showing support for an unsupported but popular modeling framework (if going down this route, I'd check in here first to make sure that it's something that is installable and testable within the examples test suite).

Nov 23 '22 20:11 BenWilson2

Hi @dbczumar @benjaminbluhm @BenWilson2, sorry, I forgot about this FR. I do have one use case in mind, i.e. "How to do prediction using .predict() for any type of model/framework". Let me know if you want to work on it; I would be happy to collaborate.

Nov 24 '22 05:11 lakshikaparihar

Hi @BenWilson2, thanks for the clarification and for explaining the scope of an in-depth step-by-step walkthrough. After your feedback yesterday I connected with @aiwalter from the sktime team; the plan is to create a custom model flavor for sktime (see the feature request in their repo). I think sktime could be a nice candidate for this FR given that it is a very popular time series ML library. If you like the idea and can confirm that it is installable and testable within the examples test suite, my suggestion would be to first work on the custom flavor implementation in sktime, and afterwards I could use my insights to work on this FR and provide a detailed step-by-step walkthrough for sktime.

If you prefer the idea from @lakshikaparihar, I am also happy to hand over the issue and see if/how I can contribute.

Nov 24 '22 06:11 benjaminbluhm

Hi @BenWilson2 @benjaminbluhm, sktime sounds like a great candidate framework. @benjaminbluhm is this still something you'd be interested in contributing?

Jan 09 '23 04:01 dbczumar

Hi @dbczumar, for the sktime MLflow implementation we have pretty much followed the standard of the MLflow built-in model flavors for the save_model(), log_model() and load_model() functions (with very little modification). The _SktimeModelWrapper class contains the custom inference logic; this is presumably where I could elaborate the most in the step-by-step walk-through.

If this sounds good to you I would be happy to contribute. To my understanding the PR will affect the following components:

[ ] area/docs: MLflow documentation pages
[ ] area/examples: Example code

I guess the first step of the PR could be to add the model flavor code in a flavor.py module under the mlflow/examples/sktime folder and also include a train.py module similar to existing examples.

I would see writing the walk-through as part of the docs as a second step. Please let me know if you think this plan makes sense or if you have a different proposal.
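
For illustration only, the save_model() / load_model() / wrapper pattern mentioned above can be sketched in a library-agnostic way. This uses only the Python standard library and is not MLflow's or sktime's actual API: `AddN`, `_ToyModelWrapper`, `load_wrapped`, the flavor name `toy_flavor`, and the JSON file names are all hypothetical stand-ins (a real flavor would write a YAML MLmodel file and typically use the library's own serialization).

```python
import json
import tempfile
from pathlib import Path

FLAVOR_NAME = "toy_flavor"    # hypothetical flavor name
MODEL_FILE = "model.json"     # hypothetical serialized-model file
CONFIG_FILE = "MLmodel.json"  # stand-in for a flavor configuration file


class AddN:
    """Toy 'model' with a native API (add) that differs from predict()."""

    def __init__(self, n):
        self.n = n

    def add(self, values):
        return [v + self.n for v in values]


class _ToyModelWrapper:
    """Holds the custom inference logic: maps predict() onto the native API."""

    def __init__(self, model):
        self._model = model

    def predict(self, data):
        return self._model.add(data)


def save_model(model, path):
    """Write the model parameters plus a small config naming the flavor."""
    path = Path(path)
    path.mkdir(parents=True, exist_ok=True)
    (path / MODEL_FILE).write_text(json.dumps({"n": model.n}))
    config = {"flavors": {FLAVOR_NAME: {"data": MODEL_FILE}}}
    (path / CONFIG_FILE).write_text(json.dumps(config))


def load_model(path):
    """Read the config, locate the model file, and rebuild the native model."""
    path = Path(path)
    config = json.loads((path / CONFIG_FILE).read_text())
    model_file = config["flavors"][FLAVOR_NAME]["data"]
    params = json.loads((path / model_file).read_text())
    return AddN(n=params["n"])


def load_wrapped(path):
    """Load the native model and expose it through the uniform predict()."""
    return _ToyModelWrapper(load_model(path))


with tempfile.TemporaryDirectory() as tmp:
    save_model(AddN(n=5), tmp)
    predictions = load_wrapped(tmp).predict([1, 2, 3])

print(predictions)
```

The split between a native loader and a wrapper is the part a walk-through would likely spend the most time on: the loader returns the library's own object, while the wrapper is the only place that needs to know how to translate a generic predict() call into library-specific inference.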

Jan 09 '23 10:01 benjaminbluhm

Hi @benjaminbluhm, apologies for the delay. That plan sounds great!

Jan 31 '23 01:01 dbczumar

Hi @dbczumar, sounds good! I will raise a draft PR in the coming weeks, once I have a first version of the custom flavor code ready, to get some preliminary feedback.

Feb 04 '23 15:02 benjaminbluhm