JAVA API Update Parameters, Metrics and Artifacts
I am using MLflow (v1.13.1) on AWS. I set up a tracking server on a Spark EMR master node, then access EMR from a SageMaker notebook instance using Livy. Since I am developing models in Scala Spark, I am using the MLflow Java API to track my experiments.
One issue I have is that I need to update the location of a model after discovering that I saved and logged the wrong model. However, I could not find anything in the API docs about how to get an existing run and update the value of an existing parameter. Is this supported by MLflow?
Currently, I can only create a nested run and re-create the same parameter there with the updated value.
@pancodia There is a way to update a parameter for an existing run using the MlflowClient() CRUD API.
In Java, you can use MlflowClient().logParam().
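For example, a minimal sketch (the tracking URI, run ID, and param key/value are placeholders; note that, as comes up later in this thread, re-logging a key that already exists with a different value will be rejected):

```java
import org.mlflow.tracking.MlflowClient;

public class UpdateParamExample {
    public static void main(String[] args) {
        // Placeholder tracking URI and run ID; substitute your own.
        MlflowClient client = new MlflowClient("http://localhost:5000");
        String runId = "fe03293adb7e4c79a716c11fc938c044";

        // Low-level CRUD call: logs a param directly against an existing run,
        // without starting a new active run.
        client.logParam(runId, "data_version", "v2");
    }
}
```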
Does that answer your question?
Thanks. This partially answers my question.
If I need to update a parameter in one of the old runs, how can I get the run (for example by name or run_id) and make it active so that I can update a parameter using MlflowClient().logParam()?
@pancodia
> If I need to update a parameter in one of the old runs, how can I get the run (for example by name or run_id) and make it active so that I can update a parameter using MlflowClient().logParam()?
Yes, you can use MlflowClient().logParam(runId, ...). This is a low-level CRUD API and it does not require you to start a run. To get the run_id, browse the runs in the MLflow UI: select the row you want from the runs table, and it will open that run's page with all its details, including the run_id.
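Alternatively, if you'd rather fetch the run programmatically than through the UI, something along these lines should work (tracking URI and run ID are placeholders):

```java
import org.mlflow.api.proto.Service.Param;
import org.mlflow.api.proto.Service.Run;
import org.mlflow.tracking.MlflowClient;

public class InspectRunExample {
    public static void main(String[] args) {
        // Placeholder tracking URI and run ID; substitute your own.
        MlflowClient client = new MlflowClient("http://localhost:5000");
        Run run = client.getRun("fe03293adb7e4c79a716c11fc938c044");

        // Print the run's status and its currently logged params.
        System.out.println("status: " + run.getInfo().getStatus());
        for (Param p : run.getData().getParamsList()) {
            System.out.println(p.getKey() + " = " + p.getValue());
        }
    }
}
```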
Nice. @dmatrix Thank you for the clarification. My question is answered.
Actually, one more question about updating an existing parameter.
In my notebook using the Spark kernel, I also use the MLflow Python API to track experiment runs locally (e.g., parameters calculated locally on the SageMaker notebook instance, plots generated locally in Python).
From this SO thread, I learned that mlflow.log_param() can update an existing parameter value (same as the Java API). However, when I tried to use it to update a parameter, I got an INVALID_PARAMETER_VALUE error.
```python
import mlflow

mlflow.end_run()
run_id = 'fe03293adb7e4c79a716c11fc938c044'
with mlflow.start_run(run_id=run_id) as run:
    mlflow.log_param("test_rmse", 0)
```
Error message:
```
RestException: INVALID_PARAMETER_VALUE: Changing param values is not allowed. Param with key='test_rmse' was already logged with value='3.061878' for run ID='fe03293adb7e4c79a716c11fc938c044'. Attempted logging new value '0'.
```
Is this because MLflow by design does not allow overwriting an existing parameter?
Actually, I logged the parameter by mistake. Can I delete an existing parameter?
"Changing param values is not allowed" is also raised when I use the Java API. Is there any way to force an overwrite of an existing parameter value?
In addition, in my setup I use a PostgreSQL database (installed on the same EMR master node) to store the results.
Any way to rewrite an existing parameter value?
Hi, did you find a solution? :)
Not yet. I am still unable to update the value of an existing parameter; the same holds for both the Python and Java APIs. I am not sure whether MLflow supports this out of the box. Still waiting for an insider to reply.
@pancodia @jhagege It seems you cannot update/replace a previously logged parameter for a specific run ID; the sketch below shows it.
I suspect this might be by design, as you do not want to taint the state or value of parameters from a previous run; it's a snapshot of that run, with all its dependencies, MLflow entities, etc.
What you could do is create a new run with the new parameter, which will be distinct from the previously logged run ID.
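For reference, a minimal Java sketch of that behavior (tracking URI is a placeholder); the second logParam call is rejected by the tracking server:

```java
import org.mlflow.tracking.MlflowClient;
import org.mlflow.tracking.MlflowClientException;

public class ParamOverwriteDemo {
    public static void main(String[] args) {
        // Placeholder tracking URI; substitute your own.
        MlflowClient client = new MlflowClient("http://localhost:5000");
        String runId = client.createRun().getRunId();

        client.logParam(runId, "test_rmse", "3.061878"); // first write succeeds
        try {
            // Re-logging the same key with a different value is rejected with
            // INVALID_PARAMETER_VALUE, matching the Python behavior above.
            client.logParam(runId, "test_rmse", "0");
        } catch (MlflowClientException e) {
            System.out.println("Overwrite rejected: " + e.getMessage());
        }
    }
}
```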
@pancodia @dmatrix Thanks. What I finally did was update the value directly in SQL:
```sql
UPDATE public.params
SET value = '****'
WHERE "key" LIKE '%secret%';
```
@pancodia @jhagege Using a backdoor is not something we would want to encourage. :-) But in desperate situations, it may be warranted.
Also, if you are using a FileStore, you could just update or modify the corresponding file under mlruns/exp_id/run_id/params/param_name_file.
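For instance, with a local FileStore each param is just a one-line text file whose content is the value, so a hedged sketch (experiment ID, run ID, and param name below are placeholders):

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

public class FixParamFile {
    public static void main(String[] args) throws Exception {
        // Placeholder experiment/run IDs and param name; substitute your own.
        Path paramFile = Paths.get("mlruns", "0",
                "fe03293adb7e4c79a716c11fc938c044", "params", "test_rmse");

        // Overwrite the param file's content with the corrected value.
        Files.write(paramFile, "2.95".getBytes());
    }
}
```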
@jhagege Thanks for the suggestion and example.
@dmatrix I agree that we should avoid using the backdoor as much as possible. However, sometimes I mistakenly log a wrong value for a parameter (it happens when I am switching between different notebooks). Also, I am new to MLflow, and sometimes I have logged an artifact in a way I did not intend. For example, I tried to log the "plots/model_diagnostics" folder as an artifact, but I didn't realize that mlflowClient.logArtifact("plots/model_diagnostics") only saves the innermost folder instead of retaining the directory structure.
Just a suggestion: would it be possible to have an MLflow admin account that has overwrite permission?
@pancodia @jhagege
> I agree that we should avoid using the backdoor as much as possible. However, sometimes I mistakenly log a wrong value for a parameter (it happens when I am switching between different notebooks). Also, I am new to MLflow, and sometimes I have logged an artifact in a way I did not intend. For example, I tried to log the "plots/model_diagnostics" folder as an artifact, but I didn't realize that mlflowClient.logArtifact("plots/model_diagnostics") only saves the innermost folder instead of retaining the directory structure.
The idea behind runs and experiments is trials: if you make a mistake in a run within an experiment, you can start another run with a different set of params, derived metrics, and artifacts to persist. Changing an experiment run's outcome after the fact is, by design, not allowed; it would violate the principles of experiment governance and the provenance of an experiment run.
> Just a suggestion: would it be possible to have an MLflow admin account that has overwrite permission?
I don't believe this use case (changing metrics or parameters after the run is finished) is common enough to warrant administrative APIs, IMHO. A run is experimental in its own right, so runs can always be discarded or dismissed as a wrong trial, or re-run with an alternate or altered set of parameters to produce different or desired outcomes.
A potential use case is retroactively fixing or adding something that warrants a change. It could be that you miscalculated a metric, recorded a wrong parameter, or wanted to change the type of an artifact.
Saying "just rerun the experiment" instead of fixing this kind of thing can come at a high cost that doesn't warrant rerunning, which means the bottleneck is the lack of the feature in the tool.
As it turns out, mlflow.start_run() does accept a run_id argument to re-open an existing run for writing (at least from v1.13), which can even be used for overwriting (and hopefully correcting :) existing metrics: https://stackoverflow.com/a/66909363/9962007
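The Java client appears to behave the same way for metrics: unlike params, a metric key can be logged again, since metrics keep a history and the latest value is what gets displayed. A sketch, with placeholder tracking URI, run ID, and metric value:

```java
import org.mlflow.tracking.MlflowClient;

public class CorrectMetricExample {
    public static void main(String[] args) {
        // Placeholder tracking URI and run ID; substitute your own.
        MlflowClient client = new MlflowClient("http://localhost:5000");
        String runId = "fe03293adb7e4c79a716c11fc938c044";

        // Unlike logParam, logMetric can be called again for an existing key:
        // the new value is appended to the metric's history, and the latest
        // value is what the UI and getRun() report.
        client.logMetric(runId, "test_rmse", 2.95);
    }
}
```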
> The idea behind runs and experiments is trials: if you make a mistake in a run within an experiment, you can start another run with a different set of params, derived metrics, and artifacts to persist. Changing an experiment run's outcome after the fact is, by design, not allowed.
I generally agree, but here's a concrete use case that doesn't quite fit this mold: I'm using autolog, and it logs one of the parameters incorrectly. I'd still like to leverage the convenience of autolog, but I would also (within the same run) like to correct the parameter as a workaround for the value being incorrectly captured by autolog. A keyword argument like overwrite: bool in the log_param and log_metric functions would be more accommodating to pragmatism.
I am facing the same issue: I logged 100+ runs, one of the parameters was logged wrong, and this parameter is used to trigger a downstream task. Now I need to recreate those 100+ runs. I understand this is discouraged by design, but the user should at least have the flexibility to change it.
I think being able to rewrite some parameters is an important feature. It should probably be enabled only behind some additional warnings/flags.
I understand the motivation for not providing the ability to overwrite, but surely there should be a method to delete a parameter and re-enter it?
Bringing this back again: are there any developments? I agree with the previous comments; it would be very useful to have a delete and a replace option in the API for tags, metrics, and parameters... Thanks :)