[FR] Move runs between experiments

Willingness to contribute

Yes. I can contribute this feature independently.

Proposal Summary

Adding functionality to move runs between experiments.

Motivation

This feature was actively requested in #1028.

Details

Once a run is selected in the experiment view, a "Move" button is shown, which opens a modal with a list of experiments. Once the runs are moved, the experiment view is refreshed to update the list of runs.

The Tracking API is extended with a move_runs method.
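
The proposal does not spell out the signature of move_runs, so the following is only a hypothetical sketch of the intended semantics against a toy in-memory store. The class name InMemoryStore, the parameter names, and the return value are all assumptions for illustration and are not part of MLflow.

```python
from dataclasses import dataclass, field


@dataclass
class InMemoryStore:
    """Toy stand-in for a tracking store (hypothetical, not an MLflow class)."""

    experiments: set[str] = field(default_factory=set)
    # Maps run ID -> ID of the experiment that currently owns the run
    run_to_experiment: dict[str, str] = field(default_factory=dict)

    def move_runs(self, run_ids: list[str], target_experiment_id: str) -> int:
        """Reassign the given runs to the target experiment.

        Returns the number of runs whose experiment actually changed.
        """
        if target_experiment_id not in self.experiments:
            raise ValueError(f"Unknown experiment: {target_experiment_id}")

        # Validate everything first so that a bad ID does not leave a partial move
        unknown = [r for r in run_ids if r not in self.run_to_experiment]
        if unknown:
            raise KeyError(f"Unknown runs: {unknown}")

        moved = 0
        for run_id in run_ids:
            if self.run_to_experiment[run_id] != target_experiment_id:
                self.run_to_experiment[run_id] = target_experiment_id
                moved += 1
        return moved
```

Validating all IDs before changing anything is one possible design choice; a real implementation would also have to decide what happens to artifact locations.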

What component(s) does this bug affect?

[ ] area/artifacts: Artifact stores and artifact logging
[ ] area/build: Build and test infrastructure for MLflow
[ ] area/docs: MLflow documentation pages
[ ] area/examples: Example code
[ ] area/model-registry: Model Registry service, APIs, and the fluent client calls for Model Registry
[ ] area/models: MLmodel format, model serialization/deserialization, flavors
[ ] area/recipes: Recipes, Recipe APIs, Recipe configs, Recipe Templates
[ ] area/projects: MLproject format, project running backends
[ ] area/scoring: MLflow Model server, model deployment tools, Spark UDFs
[ ] area/server-infra: MLflow Tracking server backend
[X] area/tracking: Tracking Service, tracking client APIs, autologging

What interface(s) does this bug affect?

[ ] area/uiux: Front-end, user experience, plotting, JavaScript, JavaScript dev server
[ ] area/docker: Docker use across MLflow's components, such as MLflow Projects and MLflow Models
[ ] area/sqlalchemy: Use of SQLAlchemy in the Tracking Service or Model Registry
[ ] area/windows: Windows support

What language(s) does this bug affect?

[ ] language/r: R APIs and clients
[ ] language/java: Java APIs and clients
[ ] language/new: Proposals for new client languages

What integration(s) does this bug affect?

[ ] integrations/azure: Azure and Azure ML integrations
[ ] integrations/sagemaker: SageMaker integrations
[ ] integrations/databricks: Databricks integrations

Dec 03 '22 00:12 starovoitovs

https://github.com/mlflow/mlflow-export-import

This package provides tools to copy MLflow objects (runs, experiments or registered models) from one MLflow tracking server (Databricks workspace) to another.

Dec 03 '22 04:12 amesar

Using this package is quite cumbersome if one wants to move a run from one experiment to another on the same server. Having a UI would be very intuitive and helpful for many users.

Dec 03 '22 04:12 starovoitovs

@starovoitovs Does the compare-experiments feature satisfy your use case?

Dec 05 '22 01:12 harupy

Not really. I would like to move runs to another experiment to group them together in a meaningful manner. For example, after performing a big experiment, I would like to split it in two.

Dec 05 '22 01:12 starovoitovs

+1 for moving a run from one experiment to another!

Dec 13 '22 11:12 LeoLionel

@BenWilson2 @dbczumar @harupy @WeichenXu123 Please assign a maintainer and start triaging this issue.

Dec 16 '22 00:12 mlflow-automation

+1

Feb 03 '23 16:02 ReHoss

+1

Apr 04 '23 06:04 kozachynskyi

+1 for moving a run from one experiment to another!

May 05 '23 15:05 mmavalankar

+1 for moving a run from one experiment to another!

Jul 13 '23 08:07 ASC689561

I think this feature would be very helpful. For example, I would choose the best model and move it to a specific experiment to keep it for a long time. In my situation, I try many models and most of them are throwaways, but I want to keep only the best ones.

Jul 13 '23 08:07 ASC689561

+1 would be very useful

Oct 25 '23 08:10 RobvanGastel

+1, running into this issue as well :)

Oct 25 '23 08:10 Jorissss

I wrote this function to do the task. It is basic and I am sure it needs more checks :-) but it worked for me. You need to know the mlruns ID numbers of the old and the new experiments.

import shutil
from pathlib import Path

import yaml

def move_mlflow_runs_to_experiment(
    base_mlruns_dir: Path,
    old_experiment_number: int,
    new_experiment_number: int,
) -> None:
    """
    Modify and move MLflow runs from one experiment to another.

    Args:
        base_mlruns_dir (Path): The base directory containing the MLflow runs.
        old_experiment_number (int): The number of the experiment to move runs from.
        new_experiment_number (int): The number of the experiment to move runs to.

    Returns:
        None

    Usage:
        >>> move_mlflow_runs_to_experiment(
        ...     base_mlruns_dir="experiments/mlruns",
        ...     old_experiment_number=29,
        ...     new_experiment_number=28,
        ... )
    """
    old_dir = Path(base_mlruns_dir, str(old_experiment_number))
    new_dir = Path(base_mlruns_dir, str(new_experiment_number))

    # Iterate over a snapshot of the direct subdirectories, since entries
    # are moved out of old_dir inside the loop
    for subdir in list(old_dir.iterdir()):
        if subdir.is_dir():
            # print information about the current directory
            print(f"Processing {subdir}")
            meta_yaml = subdir / "meta.yaml"
            if meta_yaml.exists():
                with meta_yaml.open("r") as file:
                    data = yaml.safe_load(file)

                # Modify the 'artifact_uri' if it exists
                if "artifact_uri" in data:
                    parts = data["artifact_uri"].split("/")
                    for i, part in enumerate(parts):
                        if part == "mlruns":
                            # Update the number after 'mlruns'
                            if i + 1 < len(parts):
                                parts[i + 1] = str(new_experiment_number)
                                break
                    data["artifact_uri"] = "/".join(parts)

                # Update 'experiment_id'
                data["experiment_id"] = str(new_experiment_number)

                # Write the changes back to the file
                with meta_yaml.open("w") as file:
                    yaml.dump(data, file)

            # Move the subdirectory to the new location
            new_subdir_path = new_dir / subdir.name
            if new_subdir_path.exists():
                print(f"Warning: {new_subdir_path} already exists. Skipping move.")
            else:
                shutil.move(str(subdir), str(new_subdir_path))
                print(f"Moved {subdir} to {new_subdir_path}")
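
Because a function like this edits meta.yaml files and moves directories in place, a dry run can be a useful extra check. The following is a minimal sketch, assuming the same directory layout as above and assuming that only subdirectories containing their own meta.yaml are runs; the name plan_run_moves is hypothetical.

```python
from pathlib import Path


def plan_run_moves(base_mlruns_dir, old_experiment_number, new_experiment_number):
    """List the moves that would happen, without touching the disk.

    Returns a list of (source, destination, conflict) tuples, where
    conflict is True if the destination directory already exists.
    """
    old_dir = Path(base_mlruns_dir, str(old_experiment_number))
    new_dir = Path(base_mlruns_dir, str(new_experiment_number))
    for directory in (old_dir, new_dir):
        if not directory.is_dir():
            raise FileNotFoundError(f"Experiment directory not found: {directory}")

    plan = []
    for subdir in sorted(old_dir.iterdir()):
        # Assumption: a run directory always carries its own meta.yaml;
        # anything else (plain files, other folders) is skipped
        if subdir.is_dir() and (subdir / "meta.yaml").exists():
            destination = new_dir / subdir.name
            plan.append((subdir, destination, destination.exists()))
    return plan
```

Printing the returned plan and checking it for conflicts before the real move gives a chance to back up the mlruns directory first.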

Nov 14 '23 08:11 tfha

If you are using a PostgreSQL database backend to store the runs, you can use the following function to move a run (with its children):

CREATE OR REPLACE FUNCTION move_run(run_id TEXT, new_experiment_id INT)
RETURNS INT AS $$
DECLARE
    rows_updated INT;
BEGIN
    UPDATE runs
    SET experiment_id = new_experiment_id
    WHERE run_uuid IN (
        SELECT DISTINCT runs.run_uuid
        FROM runs
        -- LEFT JOIN so that a run without any tags is still matched by its own ID
        LEFT JOIN tags ON runs.run_uuid = tags.run_uuid
        WHERE (tags.key = 'mlflow.parentRunId' AND tags.value = run_id) OR runs.run_uuid = run_id
    );
    GET DIAGNOSTICS rows_updated = ROW_COUNT;
    RETURN rows_updated;
END; $$ LANGUAGE plpgsql;

Example:

SELECT move_run('ee9eaf87705c45e686a3cc6a89dfccb5', 4);
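
The update logic can be tried out without a PostgreSQL server. The following is a rough sketch against an in-memory SQLite database, assuming heavily simplified runs and tags tables that contain only the columns used by the query above; the real tracking schema has more columns and constraints. Like the SQL function, it moves only the run itself and its direct children.

```python
import sqlite3


def move_run(conn, run_id, new_experiment_id):
    """Move a run and its direct child runs; return the number of rows updated."""
    cursor = conn.execute(
        """
        UPDATE runs
        SET experiment_id = ?
        WHERE run_uuid = ?
           OR run_uuid IN (
               SELECT run_uuid FROM tags
               WHERE key = 'mlflow.parentRunId' AND value = ?
           )
        """,
        (new_experiment_id, run_id, run_id),
    )
    conn.commit()
    return cursor.rowcount


# Simplified stand-in schema (an assumption, not the real MLflow schema)
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE runs (run_uuid TEXT PRIMARY KEY, experiment_id INTEGER);
    CREATE TABLE tags (key TEXT, value TEXT, run_uuid TEXT);
    """
)
conn.executemany(
    "INSERT INTO runs VALUES (?, ?)",
    [("parent", 1), ("child", 1), ("other", 1)],
)
# The child run points at its parent through the mlflow.parentRunId tag
conn.execute("INSERT INTO tags VALUES ('mlflow.parentRunId', 'parent', 'child')")

moved = move_run(conn, "parent", 4)
print(moved, conn.execute("SELECT * FROM runs ORDER BY run_uuid").fetchall())
```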

Apr 22 '24 16:04 darkopetrovic

Also looking for this.

Apr 30 '24 14:04 dmenig
