[FR] [Roadmap] Publish official Docker image for MLflow Tracking server

Open BenWilson2 opened this issue 2 years ago • 18 comments

MLflow Roadmap Item

This is an MLflow Roadmap item that has been prioritized by the MLflow maintainers. We’ve identified this feature as a highly requested addition to the MLflow package based on community feedback.

Contribution Note

As with other roadmap items, there may be a desire for multiple contributors to work on an issue. While we don’t discourage collaboration, we strongly encourage that a primary contributor is assigned to roadmap issues to simplify the merging process. The items on the roadmap are of a high priority. Due to the widespread demand for roadmap features, we encourage potential contributors to agree to take on the work of creating a PR, making changes, and ensuring adequate test coverage for the feature only if they are willing and able to see the implementation through to a merged state.

Feature scope

This roadmap feature’s complexity is classified as:

  • [ ] good-first-issue: This feature is limited in complexity and effort required to implement.
  • [ ] simple: This feature does not require a large amount of effort to implement and / or is clear enough to not need a design discussion with maintainers.
  • [X] involved: This feature will require a substantial amount of development effort but does not require an agreed-upon design from the maintainers. The feedback given during the PR phase may be involved and necessitate multiple iterations before approval. (Please bear with us as we collaborate with you to make a great contribution)
  • [ ] design-recommended: This is a substantial feature that should have a design document approved prior to working on an implementation (to save your time, not ours). After agreeing to work on this feature, a maintainer will be assigned to support you throughout the development process.

Proposal Summary

Build and configure automated versioned release docker images for the MLflow tracking server.

Note: the triggered execution of generating these images and pushing them to a public container repository will need to be handled by MLflow maintainers. If you wish to work on this FR, there will be heavy involvement with us.

Motivation

What is the use case for this feature?

To greatly simplify the process of starting and configuring MLflow for users.

Why is this use case valuable to support for MLflow users in general? ^

What component(s), interfaces, languages, and integrations does this feature affect?

Components

  • [ ] area/artifacts: Artifact stores and artifact logging
  • [X] area/build: Build and test infrastructure for MLflow
  • [X] area/docs: MLflow documentation pages
  • [X] area/examples: Example code
  • [ ] area/model-registry: Model Registry service, APIs, and the fluent client calls for Model Registry
  • [ ] area/models: MLmodel format, model serialization/deserialization, flavors
  • [ ] area/projects: MLproject format, project running backends
  • [ ] area/scoring: MLflow Model server, model deployment tools, Spark UDFs
  • [ ] area/server-infra: MLflow Tracking server backend
  • [ ] area/tracking: Tracking Service, tracking client APIs, autologging

Interfaces

  • [ ] area/uiux: Front-end, user experience, plotting, JavaScript, JavaScript dev server
  • [X] area/docker: Docker use across MLflow's components, such as MLflow Projects and MLflow Models
  • [ ] area/sqlalchemy: Use of SQLAlchemy in the Tracking Service or Model Registry
  • [ ] area/windows: Windows support

Languages

  • [ ] language/r: R APIs and clients
  • [ ] language/java: Java APIs and clients
  • [ ] language/new: Proposals for new client languages

Integrations

  • [ ] integrations/azure: Azure and Azure ML integrations
  • [ ] integrations/sagemaker: SageMaker integrations
  • [ ] integrations/databricks: Databricks integrations

BenWilson2 avatar Jun 17 '22 14:06 BenWilson2

@dbczumar I'd like to take a stab at it. A few questions:

  1. Is there a base image to inherit from?
  2. From the base image would it effectively be a COPY statement for https://github.com/mlflow/mlflow/tree/master/mlflow/tracking
  3. What would the CMD statement in the Dockerfile be to start the tracking server?
  4. The addition of a workflow to automatically tag + release docker images here - https://github.com/mlflow/mlflow/tree/master/.github/workflows where the docker credentials are taken from GitHub secrets?

oojo12 avatar Aug 25 '22 04:08 oojo12

Hi @oojo12, thank you for volunteering to implement this feature! Please feel free to get started on a pull request and let me know if you have additional questions.

  1. Using FROM ubuntu:20.04 as a base image seems like a great idea.
  2. From the base image, we should pip install MLflow from a specified branch of the MLflow GitHub repository
  3. The CMD for starting the tracking server should be mlflow server, with additional arguments forwarded from the docker run invocation to the mlflow server command.
  4. Spot on! I'm happy to help with this workflow from a credentials / secret management perspective.
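
One detail that matters for item 3: Docker replaces an image's CMD with whatever arguments follow the image name in docker run, whereas it appends those arguments to an exec-form ENTRYPOINT. A minimal Python model of that rule is sketched below. It is an illustration only, not the Dockerfile from this work; the function name and the example flag values are assumptions.

```python
from typing import List, Optional


def effective_command(
    entrypoint: Optional[List[str]],
    cmd: Optional[List[str]],
    run_args: List[str],
) -> List[str]:
    """Model how Docker combines exec-form ENTRYPOINT, CMD, and `docker run` arguments.

    Arguments given after the image name replace CMD entirely; the result
    is then appended to ENTRYPOINT, if one is set.
    """
    tail = run_args if run_args else (cmd or [])
    return (entrypoint or []) + tail


# CMD only: the extra arguments replace "mlflow server" instead of extending it.
print(effective_command(None, ["mlflow", "server"], ["--host", "0.0.0.0"]))

# ENTRYPOINT: the extra arguments are forwarded to "mlflow server".
print(effective_command(["mlflow", "server"], None, ["--host", "0.0.0.0"]))
```

Under this model, forwarding docker run arguments to mlflow server is more naturally expressed with ENTRYPOINT than with CMD alone.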

dbczumar avatar Aug 28 '22 20:08 dbczumar

As for 1., why not use FROM python:3.x-slim-bullseye (which is based on Debian)? Why build from the ubuntu image? In my experience, the python image has many fewer security issues, is tailored towards python applications (which mlflow is), and still allows using debian packages if need be just like the ubuntu image.

Edit: Also, if you go for ubuntu, please go for 22.04, which is the latest LTS.

martimors avatar Aug 31 '22 10:08 martimors

Would Alpine not be better? Also, do we care which Python version the image is using?

oojo12 avatar Aug 31 '22 11:08 oojo12

I'll be starting on this issue today.

oojo12 avatar Aug 31 '22 11:08 oojo12

MLflow depends on some very heavy Cython-based libraries such as pandas and numpy. In my experience, it is finicky to run them on Alpine without some tricks. Try, for example, this Dockerfile:

FROM python:3.10-alpine
RUN pip install pandas

This will not work, because the image is missing gcc. This is just one example of where getting this to work on Alpine will be "finicky" (not impossible, of course). If we can live with the slim image (based on Debian), I'd say that will save future devs some headache.

martimors avatar Aug 31 '22 12:08 martimors

@dingobar I think that Debian slim also sounds great! :)

dbczumar avatar Sep 01 '22 16:09 dbczumar

Done with the Dockerfile portion. @dbczumar, can you provide an example command so I can validate that this container works as expected?

oojo12 avatar Sep 04 '22 00:09 oojo12

Also, the current work is here if you wanted to collaborate on it.

  1. docker-compose file
  2. dockerfile
  3. workflow - the trigger here is when a new release is published, and the release tag will be taken via the GITHUB_REF env var; see the misc section below for more details.

Misc information that was combined to do this:

  1. substituting environment vars docker-compose
  2. GITHUB_REF env var workflow
  3. Action to install docker-compose (P.S. only works in Linux environments)

The above still needs to be tested but this is the current state.
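
For reference, when a workflow is triggered by a published release, GITHUB_REF holds the tag ref in the form refs/tags/<tag_name>. The following is a minimal Python sketch of deriving an image reference from it. It is not the code in the linked workflow; the function names and the example-org/mlflow-tracking repository are hypothetical.

```python
TAG_PREFIX = "refs/tags/"


def image_tag_from_ref(github_ref: str) -> str:
    """Return the tag name from a ref such as 'refs/tags/v1.2.3'."""
    if not github_ref.startswith(TAG_PREFIX):
        # Branch refs (refs/heads/...) should not produce a release image.
        raise ValueError(f"not a tag ref: {github_ref!r}")
    return github_ref[len(TAG_PREFIX):]


def image_reference(repository: str, github_ref: str) -> str:
    """Combine an image repository name with the release tag."""
    return f"{repository}:{image_tag_from_ref(github_ref)}"


# Hypothetical repository name and tag, for illustration only.
print(image_reference("example-org/mlflow-tracking", "refs/tags/v1.2.3"))
```

Rejecting non-tag refs up front keeps a manually triggered run on a branch from pushing an image with a misleading tag.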

oojo12 avatar Sep 04 '22 01:09 oojo12

Actually, since we are just pip installing from the repo, I suppose we can simplify and make a base MLflow Dockerfile, update the entrypoint command in the docker-compose file for all of MLflow's services, and reorganize to have a top-level docker folder with all the necessary files. Let me know your thoughts, @dbczumar.

oojo12 avatar Sep 04 '22 01:09 oojo12

If we are introducing images to the project now I think we should have a linter for that with a pre-commit. I suggest https://github.com/hadolint/hadolint . It will come with relevant performance and security suggestions.

martimors avatar Sep 04 '22 14:09 martimors

Like the idea. I didn't know linters existed for Dockerfiles.

oojo12 avatar Sep 04 '22 15:09 oojo12

@oojo12 This looks great! Can you create a PR for the official dockerfile from https://github.com/oojo12/mlflow/blob/tracking-server-image/mlflow/tracking/Dockerfile? From an example command perspective, I think the existing image is perfect. Users can configure properties of the server using environment variables such as MLFLOW_BACKEND_STORE_URI.
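
As a rough illustration of configuring the server through environment variables, the Python sketch below assembles a docker run command that passes MLFLOW_BACKEND_STORE_URI into the container. The helper name, the mlflow-tracking:dev image name, the SQLite URI, and the port mapping (which assumes the server listens on MLflow's default port 5000) are all assumptions made for the example.

```python
import shlex
from typing import Dict, List


def docker_run_args(image: str, env: Dict[str, str], port: int = 5000) -> List[str]:
    """Build a `docker run` argument list that passes settings as environment variables."""
    args = ["docker", "run", "--rm", "-p", f"{port}:{port}"]
    # Sort for a stable, reproducible command line.
    for name, value in sorted(env.items()):
        args += ["-e", f"{name}={value}"]
    args.append(image)
    return args


command = docker_run_args(
    "mlflow-tracking:dev",  # hypothetical local image name
    {"MLFLOW_BACKEND_STORE_URI": "sqlite:///mlflow.db"},
)
# Print a shell-safe version of the command for copy and paste.
print(shlex.join(command))
```

Building the command as a list and only joining it for display avoids quoting mistakes if a URI contains characters that are special to the shell.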

Regarding https://github.com/oojo12/mlflow/blob/tracking-server-image/.github/workflows/push-images.yml, it would be great to publish to GitHub container registry if possible. Is that an easy modification to make?

dbczumar avatar Sep 07 '22 22:09 dbczumar

@dbczumar seems simple enough per this documentation. I will submit two separate PRs. One for the image and one for the workflow.

oojo12 avatar Sep 08 '22 04:09 oojo12

The modular PRs have been made. I suppose we still need to update the documentation to mention the tracking server Docker image and add an example of running the server with it? Also, @dingobar, were you making an issue for the Dockerfile linter, or was the intention for this issue to cover that as well?

oojo12 avatar Sep 08 '22 04:09 oojo12

Assuming everything is satisfactory and gets merged I can also take up #6094 @dbczumar, since it looks stale.

oojo12 avatar Sep 08 '22 05:09 oojo12

A linter is more of a nice-to-have, we can add it later.

martimors avatar Sep 08 '22 06:09 martimors

#6731 and #6732 have been merged.

oojo12 avatar Sep 20 '22 11:09 oojo12

Thanks @oojo12 ! :). Images are available here: https://github.com/mlflow/mlflow/pkgs/container/mlflow

dbczumar avatar Jan 09 '23 04:01 dbczumar