
[FR] Upload a run from one source (e.g. the local `mlruns` directory) into another tracking server

Phylliade opened this issue 4 years ago • 4 comments

Willingness to contribute

The MLflow Community encourages new feature contributions. Would you or another member of your organization be willing to contribute an implementation of this feature (either as an MLflow Plugin or an enhancement to the MLflow code base)?

  • [ ] Yes. I can contribute this feature independently.
  • [ ] Yes. I would be willing to contribute this feature with guidance from the MLflow community.
  • [X] No. I cannot contribute this feature at this time.

Proposal Summary

Provide a way to upload a run that was already completed in the past (and stored, for example, in the local mlruns directory) into a tracking server.

Motivation

The local mlruns tracking backend is very useful when the machine that does the run (let's call it the runner) is not connected to the internet. This happens a lot in our company, Owkin, where we work with runners in hospitals' datacenters that are not connected to the internet for security reasons. For the same reasons, HPC machines in our datacenter are not connected to the internet and cannot upload their results to a tracking server. However, in both cases we can manually transfer files from the non-internet-connected runner back to a machine with internet access and do some analytics on the runs.

In this context, it would be very useful to be able to upload the runs created by the runner on a common tracking server, so that results are centralized.

(While in our setup it's mainly an Internet access issue, this issue can also help in other cases, like to do a migration of runs from one tracking server to another, and thus is tightly related to https://github.com/mlflow/mlflow/issues/2382)

What component(s), interfaces, languages, and integrations does this feature affect?

Components

  • [ ] area/artifacts: Artifact stores and artifact logging
  • [ ] area/build: Build and test infrastructure for MLflow
  • [ ] area/docs: MLflow documentation pages
  • [ ] area/examples: Example code
  • [ ] area/model-registry: Model Registry service, APIs, and the fluent client calls for Model Registry
  • [ ] area/models: MLmodel format, model serialization/deserialization, flavors
  • [ ] area/projects: MLproject format, project running backends
  • [ ] area/scoring: Local serving, model deployment tools, spark UDFs
  • [X] area/tracking: Tracking Service, tracking client APIs, autologging

Interfaces

  • [ ] area/uiux: Front-end, user experience, JavaScript, plotting
  • [ ] area/docker: Docker use across MLflow's components, such as MLflow Projects and MLflow Models
  • [ ] area/sqlalchemy: Use of SQLAlchemy in the Tracking Service or Model Registry
  • [ ] area/windows: Windows support

Languages

  • [ ] language/r: R APIs and clients
  • [ ] language/java: Java APIs and clients

Integrations

  • [ ] integrations/azure: Azure and Azure ML integrations
  • [ ] integrations/sagemaker: SageMaker integrations

Details

A way to do this is to directly copy the run directory from the runner's local mlruns directory into the --backend-store-uri directory of the tracking server and then restart the server. However, this approach is problematic:

  • It requires restarting the server
  • It is not possible if the tracking server uses a database backend
  • It does not take artifacts into account

Ideally, there would be a command to export/import runs in a serialized format. Something like this:

# On worker without internet access
mlflow run export my_run -o my_run.mlflow.zip

# transfer the zip file to a machine with internet access
# On the machine with internet access
mlflow run import -i my_run.mlflow.zip --tracking "http://YOUR-SERVER:4040"
# Now you can see your new run on the tracking server
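Such a CLI does not exist in MLflow as of this writing; the commands above are a proposal. The export half, however, is mostly a packaging problem, because the local file store keeps each run as a self-contained directory at `mlruns/<experiment_id>/<run_id>`. A minimal stdlib-only sketch (the `export_run` helper name and arguments are hypothetical, not an MLflow API):

```python
# Hypothetical helper: pack one run directory from a local "mlruns" file
# store into a zip archive for manual transfer to an internet-connected
# machine. This is a sketch, not an official MLflow command; it assumes the
# file-store layout mlruns/<experiment_id>/<run_id>/.
import shutil
from pathlib import Path


def export_run(mlruns_dir: str, experiment_id: str, run_id: str, out_stem: str) -> str:
    """Zip mlruns/<experiment_id>/<run_id> into <out_stem>.zip and return the archive path."""
    run_dir = Path(mlruns_dir) / experiment_id / run_id
    if not run_dir.is_dir():
        raise FileNotFoundError(f"no such run directory: {run_dir}")
    # shutil.make_archive appends the ".zip" suffix itself and archives the
    # directory contents relative to root_dir, so paths like "params/lr"
    # are preserved inside the zip.
    return shutil.make_archive(out_stem, "zip", root_dir=run_dir)
```

The import half is the harder part, since it must replay the run's metadata into the target server's backend (file store, database, and artifact store) rather than just copy files.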

Phylliade avatar Jun 08 '20 14:06 Phylliade

duplicate? #2512

magnus-m avatar Jun 09 '20 19:06 magnus-m

Seems like #2512 is specific to artifacts, while this issue also encompasses runs?

Phylliade avatar Jun 10 '20 07:06 Phylliade

> Seems like #2512 is specific to artifacts, while this issue also encompasses runs?

Yeah, you are right :-)

magnus-m avatar Jun 12 '20 21:06 magnus-m

any update on this?

gabrielfu avatar Sep 01 '22 07:09 gabrielfu

I think this can be done using the tools in this repo: https://github.com/mlflow/mlflow-export-import

garymm avatar Nov 01 '22 23:11 garymm

Hello, I just faced the same problem: I started using MLflow with a local mlruns directory because it was easier, accumulated ~10000 runs, realized that the UI (and searching the runs in general) was too slow, and decided to switch to a SQL database. I wanted to migrate all my runs to the new database, but unfortunately mlflow-export-import does not handle migrating from local storage. Therefore, I have developed some simple code that can migrate at least the experiments and the runs in each experiment to a database. I am sure the code still has several limitations, but it did the job for my use case and perhaps can help anyone facing the same issue. Maybe it can even be a starting point for implementing something more official inside the project? Anyway, here is my code: https://github.com/BrunoBelucci/mlflow_util/blob/main/migrate_mlflow_backend.py
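The reading half of such a migration can be sketched without MLflow at all, because the local file store keeps params and metrics as plain files under `mlruns/<experiment_id>/<run_id>/`. The layout assumed below (one file per param holding its value; metric files with `timestamp value step` lines) matches the local FileStore at the time of writing, but it is an internal detail and may change; the `read_file_store_run` helper is illustrative, not part of any MLflow API:

```python
# Sketch: read params and metric histories back out of a file-store run
# directory. ASSUMED layout (internal FileStore detail, may change):
#   <run_dir>/params/<name>   -> file whose content is the param value
#   <run_dir>/metrics/<name>  -> lines of "timestamp value [step]"
from pathlib import Path


def read_file_store_run(run_dir: str):
    run = Path(run_dir)
    params = {p.name: p.read_text().strip() for p in (run / "params").glob("*")}
    metrics = {}
    for m in (run / "metrics").glob("*"):
        history = []
        for line in m.read_text().splitlines():
            fields = line.split()  # "timestamp value [step]"
            ts, value = int(fields[0]), float(fields[1])
            step = int(fields[2]) if len(fields) > 2 else 0
            history.append((ts, value, step))
        metrics[m.name] = history
    return params, metrics

# Replaying into the new backend would then go through the regular client
# against the target tracking URI, e.g. MlflowClient(new_uri).create_run(...)
# followed by log_param / log_metric for each entry parsed above.
```

Parsing the files directly avoids standing up a server for the old store, at the cost of depending on the FileStore's on-disk format.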

BrunoBelucci avatar Sep 28 '23 18:09 BrunoBelucci