JAVA API Update Parameters, Metrics and Artifacts

Open pancodia opened this issue 4 years ago • 18 comments

I am using MLflow (v1.13.1) on AWS. I set up a tracking server on a Spark EMR master node and access EMR from a SageMaker notebook instance through Livy. Since I am developing models in Scala Spark, I am using the MLflow Java API to track my experiments.

One issue I have is that I need to update the location of a model after finding that I saved and logged the wrong one. However, I could not find in the API docs how to get an existing run and update the value of an existing parameter. Is this supported by MLflow?

Currently I can only create a nested run and log the same parameter in the nested run with the updated value.

pancodia avatar Jan 19 '21 20:01 pancodia

@pancodia There is a way to update a parameter for an existing run using the MLflowClient() CRUD.

In Java, you can use the Java MLflowClient().logParam()

Does that answer your question?

dmatrix avatar Jan 21 '21 17:01 dmatrix

Thanks. This partially answers my question.

If I need to update a parameter in one of the old runs, how can I get the run (for example by name or run_id) and make it active so that I can update a parameter using MLflowClient().logParam()?

pancodia avatar Jan 21 '21 19:01 pancodia

@pancodia

If I need to update a parameter in one of the old runs, how can I get the run (for example by name or run_id) and make it active so that I can update a parameter using MLflowClient().logParam()?

Yes, you can use MLflowClient().logParam(uid..). This is a low-level CRUD API, and it won't require you to start a run. To get the run_id, browse the table of runs in the MLflow UI: select the row you want, and it will open that run's page with all its details, including the run_id.

dmatrix avatar Jan 22 '21 18:01 dmatrix

Nice. @dmatrix Thank you for the clarification. My question is answered.

pancodia avatar Jan 22 '21 19:01 pancodia

Actually one more question about updating an existing parameter.

In my notebook, which uses the Spark kernel, I also use the MLflow Python API to track experiment runs locally (e.g. parameters calculated locally on the SageMaker notebook instance, plots generated in Python locally).

From this SO thread, I learned that mlflow.log_param() can update an existing parameter value (same as the Java API).

However, when I try to use it to update a parameter, I get an INVALID_PARAMETER_VALUE error.

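import mlflow

mlflow.end_run()  # close any run still active in this session
run_id = 'fe03293adb7e4c79a716c11fc938c044'
# Re-open the existing run and try to log a different value for an existing param
with mlflow.start_run(run_id=run_id) as run:
    mlflow.log_param("test_rmse", 0)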

Error message:

RestException: INVALID_PARAMETER_VALUE: Changing param values is not allowed. Param with key='test_rmse' was already logged with value='3.061878' for run ID='fe03293adb7e4c79a716c11fc938c044'. Attempted logging new value '0'.

Is this because MLflow by design does not allow overwriting an existing parameter?

Actually, I logged the parameter by mistake. Can I delete an existing parameter?

"Changing param values is not allowed" is also raised when I use the Java API. Is there any way to force an overwrite?

pancodia avatar Jan 22 '21 20:01 pancodia

Any way to rewrite an existing parameter value?

pancodia avatar Feb 03 '21 00:02 pancodia

In addition, in my setup, I use a Postgresql database (installed on the same EMR master node) to store the results.

pancodia avatar Feb 04 '21 21:02 pancodia

Any way to rewrite an existing parameter value?

Hi, did you find a solution ? :)

jhagege avatar Feb 08 '21 15:02 jhagege

Not yet. I am still unable to update the value of an existing parameter. It is the same for both the Python and Java APIs. I am not sure if this is supported by MLflow out of the box. Still waiting for an insider to reply.

pancodia avatar Feb 08 '21 20:02 pancodia

@pancodia @jhagege It seems like you cannot update/replace a previously logged parameter with a specific run id. This code snippet shows it.

I suspect this might be by design, as you do not want to taint the state or value of parameters from a previous run; it's a snapshot of that run, with all its dependencies, MLflow entities, etc.

What you could do is create a new run with the new parameter, which will be distinct from the previously logged run id.
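To make the behaviour reported earlier in the thread concrete, here is a toy, in-memory sketch of write-once parameters. It is not MLflow's implementation: the `ParamStore` class and its methods are hypothetical stand-ins, and only the error text is modelled on the message quoted above. It shows that re-logging the same value is accepted, a different value for the same run is rejected, and a new run can carry the corrected value.

```python
class ParamStore:
    """Toy stand-in for a tracking store with write-once params (hypothetical)."""

    def __init__(self):
        # Maps (run_id, key) to the value, stored as a string.
        self._params = {}

    def log_param(self, run_id, key, value):
        value = str(value)
        existing = self._params.get((run_id, key))
        # Same value again is harmless; a different value is refused.
        if existing is not None and existing != value:
            raise ValueError(
                f"Changing param values is not allowed. Param with key='{key}' "
                f"was already logged with value='{existing}' for run ID='{run_id}'. "
                f"Attempted logging new value '{value}'."
            )
        self._params[(run_id, key)] = value

    def get_param(self, run_id, key):
        return self._params[(run_id, key)]


store = ParamStore()
store.log_param("run-a", "test_rmse", 3.061878)
store.log_param("run-a", "test_rmse", 3.061878)  # same value: accepted

rejected = None
try:
    store.log_param("run-a", "test_rmse", 0)  # different value: refused
except ValueError as err:
    rejected = str(err)

# The suggested route: log the corrected value under a separate run.
store.log_param("run-b", "test_rmse", 0)
```

The original run keeps its snapshot, and the correction lives in a run of its own.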

dmatrix avatar Feb 08 '21 21:02 dmatrix

@pancodia @dmatrix Thanks. What I did finally is directly update the value from the SQL.

UPDATE public.params
SET value='****'
WHERE "key" like '%secret%'
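For anyone tempted to do the same, a narrower variant may be safer than a LIKE pattern that touches every run. The sketch below uses an in-memory SQLite table as a stand-in for the real backend. The table layout (key, value, run_uuid) is an assumption that should be checked against your actual schema, and `overwrite_param` is a hypothetical helper, not part of MLflow. It updates one key for one run and reports how many rows changed.

```python
import sqlite3

# Stand-in database; the column layout is an assumption, not the verified schema.
conn = sqlite3.connect(":memory:")
conn.execute(
    'CREATE TABLE params ("key" TEXT, value TEXT, run_uuid TEXT, '
    'PRIMARY KEY ("key", run_uuid))'
)
conn.executemany(
    "INSERT INTO params VALUES (?, ?, ?)",
    [("secret_token", "abc123", "run-a"), ("secret_token", "xyz789", "run-b")],
)
conn.commit()


def overwrite_param(conn, run_uuid, key, new_value):
    """Overwrite one param of one run; returns the number of rows changed."""
    with conn:  # commits on success, rolls back on error
        cur = conn.execute(
            'UPDATE params SET value = ? WHERE run_uuid = ? AND "key" = ?',
            (new_value, run_uuid, key),
        )
    return cur.rowcount


changed = overwrite_param(conn, "run-a", "secret_token", "****")
rows = dict(
    conn.execute('SELECT run_uuid, value FROM params WHERE "key" = ?', ("secret_token",))
)
```

Checking the returned row count before trusting the change is a cheap guard against a typo in the run id or key. Against PostgreSQL the same statement would go through a driver such as psycopg2, which uses %s placeholders instead of ?.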

jhagege avatar Feb 09 '21 15:02 jhagege

@pancodia @jhagege Using a backdoor is not something we would want to encourage. :-) But in desperate situations, it may be warranted.

Also, if you are using a FileStore, you could just update or modify the corresponding file under mlruns/exp_id/run_id/params/param_name_file.
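A minimal sketch of that FileStore edit follows, assuming the layout named above and assuming each param file holds the raw value as plain text. The helper name `overwrite_filestore_param` is hypothetical, and the demo runs against a temporary directory rather than a real mlruns folder.

```python
import tempfile
from pathlib import Path


def overwrite_filestore_param(mlruns_dir, exp_id, run_id, key, new_value):
    """Rewrite one param file in place and return the previous value."""
    param_file = Path(mlruns_dir) / exp_id / run_id / "params" / key
    # Refuse to create new params by accident: only existing files are edited.
    if not param_file.is_file():
        raise FileNotFoundError(f"No logged param at {param_file}")
    old_value = param_file.read_text()
    param_file.write_text(str(new_value))
    return old_value


# Demo on a throwaway directory that mimics the assumed layout.
with tempfile.TemporaryDirectory() as tmp:
    run_id = "fe03293adb7e4c79a716c11fc938c044"
    params_dir = Path(tmp) / "0" / run_id / "params"
    params_dir.mkdir(parents=True)
    (params_dir / "test_rmse").write_text("3.061878")

    previous = overwrite_filestore_param(tmp, "0", run_id, "test_rmse", 0)
    current = (params_dir / "test_rmse").read_text()
```

Returning the old value makes it easy to keep a note of what was changed, since the store itself will not record the edit.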

dmatrix avatar Feb 09 '21 17:02 dmatrix

@jhagege Thanks for the suggestion and example.

@dmatrix I agree that we should avoid using a backdoor as much as possible. However, sometimes I mistakenly log a wrong value of a parameter (it happens when I am switching between different notebooks). Also, I am new to MLflow, and sometimes I logged an artifact in a way I did not intend. For example, I tried to log the "plots/model_diagnostics" folder as an artifact. However, I didn't realize that mlflowClient.logArtifact("plots/model_diagnostics") only saves the innermost folder instead of retaining the directory structure.

Just a suggestion. Is it possible to have an MLflow admin account that has overwrite permission?

pancodia avatar Feb 09 '21 18:02 pancodia

@pancodia @jhagege

I agree that we should avoid using the backdoor as much as possible. However, sometimes I mistakenly log a wrong value of a parameter (it happens when I am switching between different notebooks). Also, I am new to MLflow, and sometimes I logged an artifact in a way I did not intend. For example, I tried to log the "plots/model_diagnostics" folder as an artifact. However, I didn't realize that mlflowClient.logArtifact("plots/model_diagnostics") only saves the innermost folder instead of retaining the directory structure.

The idea behind runs and experiments is trials: if you make a mistake in a run within an experiment, you can start another run with a different set of params, derived metrics, and artifacts to persist. By design, changing an experiment run's results after the fact goes against that idea; it would violate the principle of experiment governance and the provenance of an experiment run.

Just a suggestion. Is it possible to have an MLflow admin account that has overwrite permission?

I don't believe this use case (changing the metrics or parameters after the run is finished) is common enough to warrant administrative APIs, IMHO. A run is experimental in its own right, so experimental runs can always be discarded or dismissed as a wrong trial, or re-run with an alternate or altered set of parameters to produce different or desired outcomes.

dmatrix avatar Feb 09 '21 19:02 dmatrix

A potential use case is retroactively fixing or adding something after an event that warrants a change. It could be that you miscalculated a metric, recorded a wrong parameter, or wanted to change the type of an artifact.

Saying "just rerun the experiment" instead of fixing this kind of thing can come at a high cost that doesn't warrant rerunning, which means that the bottleneck is the lack of this feature in the tool.

Andrej-Marsic avatar Mar 10 '21 19:03 Andrej-Marsic

As it turns out, mlflow.start_run() does accept a run_id argument to re-open an existing run for writing (at least from v1.13), which can even be used to overwrite (and hopefully correct :) existing metrics: https://stackoverflow.com/a/66909363/9962007

mirekphd avatar Apr 01 '21 08:04 mirekphd

The idea behind runs and experiments is trials: if you make a mistake in a run within an experiment, you can start another run with a different set of params, derived metrics, and artifacts to persist. By design, changing an experiment run's results after the fact goes against that idea.

I generally agree, but here's a concrete use case that doesn't quite fit this mold: I'm using autolog, and it logs one of the parameters incorrectly. I'd still like to continue to leverage the convenience of autolog, but I would also (within the same run) like to correct the parameter as a workaround for the value being incorrectly captured by autolog. A keyword argument like overwrite: bool in the log_param and log_metric functions would be more accommodating to pragmatism.

notmatthancock avatar Feb 14 '22 17:02 notmatthancock

I am facing the same issue: I logged 100+ runs, and one of the parameters, which is used to trigger a downstream task, was logged wrong. Now I need to recreate these 100+ runs. I understand that by design this is not encouraged behavior, but the tool should at least give the user the flexibility to change it.

SoulEvill avatar Jul 28 '22 07:07 SoulEvill

I think being able to rewrite some parameters is an important feature. Probably it should be enabled only through some additional warnings/flags.

andompesta avatar Aug 24 '23 20:08 andompesta

I understand the motivation for not providing the ability to overwrite, but surely there should be a method to delete a parameter and re-enter it?

NedJWestern avatar Sep 25 '23 10:09 NedJWestern

Bringing this back again: are there any developments? I agree with the previous comments; it would be very useful to have a delete and a replace option in the API for tags, metrics, and parameters... Thanks :)

mloureiro-ilof avatar Apr 04 '24 14:04 mloureiro-ilof