
[BUG] mlflow gc command raises an exception when serving artifacts as local files

Open trillionmonster opened this issue 2 years ago • 14 comments

Willingness to contribute

Yes. I can contribute a fix for this bug independently.

System information

  • Have I written custom code (as opposed to using a stock example script provided in MLflow): no
  • OS Platform and Distribution (e.g., Linux Ubuntu 16.04): Linux Ubuntu 20.04
  • MLflow installed from (source or binary): binary
  • MLflow version (run mlflow --version): 1.25.1
  • Python version: 3.9
  • npm version, if running the dev UI: None

Describe the problem

The mlflow gc command raises an exception when artifacts are served as local files.

My solution is to add the following code to mlflow/mlflow/store/artifact/mlflow_artifacts_repo.py at line 61:

        # If the tracking URI is a local file, return the artifact path under "./mlartifacts", in the same folder as mlruns
        if track_parse.scheme == "file":
            return os.path.join(os.path.dirname(track_parse.path), "mlartifacts", uri_parse.path[1:])
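The proposed fix can be tried in isolation with a minimal standalone sketch. The function name `resolve_local_artifact_path` and the example paths are hypothetical, and the sketch only mirrors the snippet above; it is not MLflow's actual `resolve_uri` implementation. It assumes POSIX-style paths.

```python
import posixpath
from urllib.parse import urlparse


def resolve_local_artifact_path(tracking_uri: str, artifact_uri: str) -> str:
    """Map an mlflow-artifacts URI to a path under ./mlartifacts (hypothetical helper)."""
    track_parse = urlparse(tracking_uri)
    uri_parse = urlparse(artifact_uri)
    if track_parse.scheme != "file":
        raise ValueError(f"expected a file tracking URI, got scheme {track_parse.scheme!r}")
    # mlartifacts is assumed to sit next to mlruns, as in the proposed fix
    root = posixpath.dirname(track_parse.path)
    # Drop the leading "/" so the artifact path is joined as a relative path
    return posixpath.join(root, "mlartifacts", uri_parse.path[1:])
```

For example, a tracking URI of `file:///home/u/mlflow-root/mlruns` and an artifact URI of `mlflow-artifacts:/1/abc/artifacts` would resolve to a path under `/home/u/mlflow-root/mlartifacts`.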

Tracking information

No response

Code to reproduce issue

cd ./mlflow-root
mlflow server -h 0.0.0.0 -p 18888 --serve-artifacts

After deleting MLflow runs, run the following from within mlflow-root:

mlflow gc

Other info / logs

Traceback (most recent call last):
  File "/opt/anaconda3/envs/mlflow/bin/mlflow", line 8, in <module>
    sys.exit(cli())
  File "/opt/anaconda3/envs/mlflow/lib/python3.9/site-packages/click/core.py", line 1128, in __call__
    return self.main(*args, **kwargs)
  File "/opt/anaconda3/envs/mlflow/lib/python3.9/site-packages/click/core.py", line 1053, in main
    rv = self.invoke(ctx)
  File "/opt/anaconda3/envs/mlflow/lib/python3.9/site-packages/click/core.py", line 1659, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/opt/anaconda3/envs/mlflow/lib/python3.9/site-packages/click/core.py", line 1395, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/opt/anaconda3/envs/mlflow/lib/python3.9/site-packages/click/core.py", line 754, in invoke
    return __callback(*args, **kwargs)
  File "/opt/anaconda3/envs/mlflow/lib/python3.9/site-packages/mlflow/cli.py", line 489, in gc
    artifact_repo = get_artifact_repository(run.info.artifact_uri)
  File "/opt/anaconda3/envs/mlflow/lib/python3.9/site-packages/mlflow/store/artifact/artifact_repository_registry.py", line 107, in get_artifact_repository
    return _artifact_repository_registry.get_artifact_repository(artifact_uri)
  File "/opt/anaconda3/envs/mlflow/lib/python3.9/site-packages/mlflow/store/artifact/artifact_repository_registry.py", line 73, in get_artifact_repository
    return repository(artifact_uri)
  File "/opt/anaconda3/envs/mlflow/lib/python3.9/site-packages/mlflow/store/artifact/mlflow_artifacts_repo.py", line 46, in __init__
    super().__init__(self.resolve_uri(artifact_uri, get_tracking_uri()))
  File "/opt/anaconda3/envs/mlflow/lib/python3.9/site-packages/mlflow/store/artifact/mlflow_artifacts_repo.py", line 61, in resolve_uri
    _validate_uri_scheme(track_parse.scheme)
  File "/opt/anaconda3/envs/mlflow/lib/python3.9/site-packages/mlflow/store/artifact/mlflow_artifacts_repo.py", line 35, in _validate_uri_scheme
    raise MlflowException(
mlflow.exceptions.MlflowException: The configured tracking uri scheme: 'file' is invalid for use with the proxy mlflow-artifact scheme. The allowed tracking schemes are: {'https', 'http'}
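The check that fails at the bottom of this traceback can be approximated as follows. This is a simplified reconstruction based only on the error message, not MLflow's source; the function name `check_tracking_scheme` is hypothetical, and treating a URI without a scheme (such as `./mlruns`) as `file` is an assumption.

```python
from urllib.parse import urlparse

# Schemes listed as allowed in the error message above
ALLOWED_TRACKING_SCHEMES = {"http", "https"}


def check_tracking_scheme(tracking_uri: str) -> str:
    """Return the scheme if it is usable with mlflow-artifacts URIs, else raise."""
    # Assumption: a bare local path has no scheme and is treated as "file"
    scheme = urlparse(tracking_uri).scheme or "file"
    if scheme not in ALLOWED_TRACKING_SCHEMES:
        raise ValueError(
            f"The configured tracking uri scheme: '{scheme}' is invalid "
            "for use with the proxy mlflow-artifact scheme."
        )
    return scheme
```

Under this reading, `mlflow gc` fails because the process running it resolves its tracking URI to a local path rather than to the HTTP address of the server.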

What component(s) does this bug affect?

  • [X] area/artifacts: Artifact stores and artifact logging
  • [ ] area/build: Build and test infrastructure for MLflow
  • [ ] area/docs: MLflow documentation pages
  • [ ] area/examples: Example code
  • [ ] area/model-registry: Model Registry service, APIs, and the fluent client calls for Model Registry
  • [ ] area/models: MLmodel format, model serialization/deserialization, flavors
  • [ ] area/projects: MLproject format, project running backends
  • [ ] area/scoring: MLflow Model server, model deployment tools, Spark UDFs
  • [ ] area/server-infra: MLflow Tracking server backend
  • [ ] area/tracking: Tracking Service, tracking client APIs, autologging

What interface(s) does this bug affect?

  • [ ] area/uiux: Front-end, user experience, plotting, JavaScript, JavaScript dev server
  • [ ] area/docker: Docker use across MLflow's components, such as MLflow Projects and MLflow Models
  • [ ] area/sqlalchemy: Use of SQLAlchemy in the Tracking Service or Model Registry
  • [ ] area/windows: Windows support

What language(s) does this bug affect?

  • [ ] language/r: R APIs and clients
  • [ ] language/java: Java APIs and clients
  • [ ] language/new: Proposals for new client languages

What integration(s) does this bug affect?

  • [ ] integrations/azure: Azure and Azure ML integrations
  • [ ] integrations/sagemaker: SageMaker integrations
  • [ ] integrations/databricks: Databricks integrations

trillionmonster avatar May 12 '22 03:05 trillionmonster

@BenWilson2 @harupy Can you take a look at this?

dbczumar avatar May 16 '22 21:05 dbczumar

Hi @trillionmonster , could you verify these attempts at reproducing this failure and let me know if I'm missing the plot here?

Trial #1:

terminal1:

cd ~
mlflow server -h 0.0.0.0 -p 8889 --serve-artifacts

script:

import mlflow
from mlflow.tracking import MlflowClient
from random import random, randint
from sklearn.ensemble import RandomForestRegressor

mlflow.set_tracking_uri("http://0.0.0.0:8889")

client = MlflowClient()
client.list_experiments()
>> [<Experiment: artifact_location='./mlruns/0', experiment_id='0', lifecycle_stage='active', name='Default', tags={}>]

mlflow.set_experiment("gc_test")

run_name = "gc_test_1"

with mlflow.start_run(run_name=run_name) as run:
    print(f"Artifact uri: {run.info.artifact_uri}\n")
    params = {"n_estimators": 5, "random_state": 42}
    model = RandomForestRegressor(**params)
    
    mlflow.log_params(params)
    mlflow.log_param("param_1", randint(0, 100))
    mlflow.log_metrics({"metric_1": random(), "metric_2": random() + 1})

    mlflow.sklearn.log_model(
        sk_model=model,
        artifact_path=f"{run_name}/model"
    )
    print(f"\nRun info: {run.info}")

# Execute above twice
	
client.list_experiments()
>> [<Experiment: artifact_location='./mlruns/0', experiment_id='0', lifecycle_stage='active', name='Default', tags={}>,
 <Experiment: artifact_location='mlflow-artifacts:/1', experiment_id='1', lifecycle_stage='active', name='gc_test', tags={}>]

exp2 = client.list_run_infos(experiment_id=1)
exp2
>> [<RunInfo: artifact_uri='mlflow-artifacts:/1/4152b9895f8045e0af602b2e7bb62684/artifacts', end_time=1652744412469, experiment_id='1', lifecycle_stage='active', run_id='4152b9895f8045e0af602b2e7bb62684', run_uuid='4152b9895f8045e0af602b2e7bb62684', start_time=1652744411081, status='FINISHED', user_id='benjamin.wilson'>,
 <RunInfo: artifact_uri='mlflow-artifacts:/1/b17935eb8098467bbe4accd21afdcddd/artifacts', end_time=1652744403790, experiment_id='1', lifecycle_stage='active', run_id='b17935eb8098467bbe4accd21afdcddd', run_uuid='b17935eb8098467bbe4accd21afdcddd', start_time=1652744402413, status='FINISHED', user_id='benjamin.wilson'>]

mlflow.delete_run(exp2[0].run_id)

exp3 = client.list_run_infos(experiment_id=1)
exp3
>> [<RunInfo: artifact_uri='mlflow-artifacts:/1/b17935eb8098467bbe4accd21afdcddd/artifacts', end_time=1652744403790, experiment_id='1', lifecycle_stage='active', run_id='b17935eb8098467bbe4accd21afdcddd', run_uuid='b17935eb8098467bbe4accd21afdcddd', start_time=1652744402413, status='FINISHED', user_id='benjamin.wilson'>]

mlflow.delete_run(exp3[0].run_id)

exp3 = client.list_run_infos(experiment_id=1)
exp3
>> []

term2:

cd ~/mlruns
mlflow gc

cd ~/mlartifacts
mlflow gc

No exception is thrown.

Trial #2

  1. Start server
    1. mlflow server -h 0.0.0.0 -p 8889 --serve-artifacts
  2. Log 3 runs to new experiment
  3. perform cli operations as below:
~ via 🅒 mlflow-dev-env via 🐍 dev-env
➜ ls
Downloads                Movies                     repos
Applications             JupyterNotebooks         Music                    miniconda3               universe
Desktop                  Library                  Pictures                 miniforge3              
Documents                Public                   mlruns
(mlflow-dev-env)
~ via 🅒 mlflow-dev-env via 🐍 dev-env
➜ ls
Downloads                Movies                     mlruns
Applications             JupyterNotebooks         Music                    miniconda3               repos
Desktop                  Library                  Pictures                 miniforge3              
Documents                Public                   mlartifacts             
(mlflow-dev-env)
~ via 🅒 mlflow-dev-env via 🐍 dev-env
➜ rm -rf mlruns
(mlflow-dev-env)
~ via 🅒 mlflow-dev-env via 🐍 dev-env
➜ ls
Downloads                Movies                    repos
Applications             JupyterNotebooks         Music                    miniconda3              
Desktop                  Library                  Pictures                 miniforge3              
Documents                Public                   mlartifacts
(mlflow-dev-env)
~ via 🅒 mlflow-dev-env via 🐍 dev-env
➜ mlflow gc
(mlflow-dev-env)
~ via 🅒 mlflow-dev-env via 🐍 dev-env took 2s
➜ ls
Downloads                Movies                     mlruns
Applications             JupyterNotebooks         Music                    miniconda3               repos
Desktop                  Library                  Pictures                 miniforge3              
Documents                Public                   mlartifacts              
(mlflow-dev-env)

Trial #3

  1. Start server
    1. mlflow server -h 0.0.0.0 -p 8889 --serve-artifacts
  2. Perform cli operations as below:
~ via 🅒 mlflow-dev-env via 🐍 dev-env
➜ ls -l
total 56480
drwx------@   4 benjamin.wilson  staff       128 Apr 25 11:35 Applications
drwx------@   7 benjamin.wilson  staff       224 May  9 19:22 Desktop
drwx------@   3 benjamin.wilson  staff        96 Apr 25 09:47 Documents
drwx------@   6 benjamin.wilson  staff       192 May 13 14:23 Downloads
drwxr-xr-x    9 benjamin.wilson  staff       288 May 16 19:46 JupyterNotebooks
drwx------@  77 benjamin.wilson  staff      2464 May 12 14:56 Library
drwx------    3 benjamin.wilson  staff        96 Apr 25 09:47 Movies
drwx------+   4 benjamin.wilson  staff       128 Apr 25 10:37 Music
drwx------+   4 benjamin.wilson  staff       128 Apr 25 10:37 Pictures
drwxr-xr-x+   4 benjamin.wilson  staff       128 Apr 25 09:47 Public
drwxr-xr-x   15 benjamin.wilson  staff       480 May 12 12:37 miniconda3
drwxr-xr-x   16 benjamin.wilson  staff       512 May  5 13:04 miniforge3
drwxr-xr-x    9 benjamin.wilson  staff       288 May  3 20:44 repos
(mlflow-dev-env)
~ via 🅒 mlflow-dev-env via 🐍 dev-env
➜ ls -l
total 56480
drwx------@   4 benjamin.wilson  staff       128 Apr 25 11:35 Applications
drwx------@   7 benjamin.wilson  staff       224 May  9 19:22 Desktop
drwx------@   3 benjamin.wilson  staff        96 Apr 25 09:47 Documents
drwx------@   6 benjamin.wilson  staff       192 May 13 14:23 Downloads
drwxr-xr-x    9 benjamin.wilson  staff       288 May 16 19:46 JupyterNotebooks
drwx------@  77 benjamin.wilson  staff      2464 May 12 14:56 Library
drwx------    3 benjamin.wilson  staff        96 Apr 25 09:47 Movies
drwx------+   4 benjamin.wilson  staff       128 Apr 25 10:37 Music
drwx------+   4 benjamin.wilson  staff       128 Apr 25 10:37 Pictures
drwxr-xr-x+   4 benjamin.wilson  staff       128 Apr 25 09:47 Public
drwxr-xr-x   15 benjamin.wilson  staff       480 May 12 12:37 miniconda3
drwxr-xr-x   16 benjamin.wilson  staff       512 May  5 13:04 miniforge3
**drwxr-xr-x    4 benjamin.wilson  staff       128 May 16 19:57 mlruns**
drwxr-xr-x    9 benjamin.wilson  staff       288 May  3 20:44 repos
(mlflow-dev-env)
~ via 🅒 mlflow-dev-env via 🐍 dev-env
➜ rm -rf  mlruns
(mlflow-dev-env)
~ via 🅒 mlflow-dev-env via 🐍 dev-env
➜ ls -l
total 56480
drwx------@   4 benjamin.wilson  staff       128 Apr 25 11:35 Applications
drwx------@   7 benjamin.wilson  staff       224 May  9 19:22 Desktop
drwx------@   3 benjamin.wilson  staff        96 Apr 25 09:47 Documents
drwx------@   6 benjamin.wilson  staff       192 May 13 14:23 Downloads
drwxr-xr-x    9 benjamin.wilson  staff       288 May 16 19:46 JupyterNotebooks
drwx------@  77 benjamin.wilson  staff      2464 May 12 14:56 Library
drwx------    3 benjamin.wilson  staff        96 Apr 25 09:47 Movies
drwx------+   4 benjamin.wilson  staff       128 Apr 25 10:37 Music
drwx------+   4 benjamin.wilson  staff       128 Apr 25 10:37 Pictures
drwxr-xr-x+   4 benjamin.wilson  staff       128 Apr 25 09:47 Public
drwxr-xr-x   15 benjamin.wilson  staff       480 May 12 12:37 miniconda3
drwxr-xr-x   16 benjamin.wilson  staff       512 May  5 13:04 miniforge3
drwxr-xr-x    9 benjamin.wilson  staff       288 May  3 20:44 repos
(mlflow-dev-env)
~ via 🅒 mlflow-dev-env via 🐍 dev-env
➜ mlflow gc
(mlflow-dev-env)
~ via 🅒 mlflow-dev-env via 🐍 dev-env
➜ ls -l
total 56480
drwx------@   4 benjamin.wilson  staff       128 Apr 25 11:35 Applications
drwx------@   7 benjamin.wilson  staff       224 May  9 19:22 Desktop
drwx------@   3 benjamin.wilson  staff        96 Apr 25 09:47 Documents
drwx------@   6 benjamin.wilson  staff       192 May 13 14:23 Downloads
drwxr-xr-x    9 benjamin.wilson  staff       288 May 16 19:46 JupyterNotebooks
drwx------@  77 benjamin.wilson  staff      2464 May 12 14:56 Library
drwx------    3 benjamin.wilson  staff        96 Apr 25 09:47 Movies
drwx------+   4 benjamin.wilson  staff       128 Apr 25 10:37 Music
drwx------+   4 benjamin.wilson  staff       128 Apr 25 10:37 Pictures
drwxr-xr-x+   4 benjamin.wilson  staff       128 Apr 25 09:47 Public
drwxr-xr-x   15 benjamin.wilson  staff       480 May 12 12:37 miniconda3
drwxr-xr-x   16 benjamin.wilson  staff       512 May  5 13:04 miniforge3
**drwxr-xr-x    4 benjamin.wilson  staff       128 May 16 19:58 mlruns**
drwxr-xr-x    9 benjamin.wilson  staff       288 May  3 20:44 repos
(mlflow-dev-env)

BenWilson2 avatar May 17 '22 00:05 BenWilson2

I'm having the same issue as @trillionmonster. This is my code to reproduce:

Launch server:

mlflow server --backend-store-uri sqlite:///C:/mlflow/mlflow.db --artifacts-destination file:///C:/mlflow --serve-artifacts --host 0.0.0.0 -p 5000

Call gc:

mlflow gc --backend-store-uri sqlite:///C:/mlflow/mlflow.db

I'm running on Windows 10, Python 3.8.10 and mlflow 1.26.1

DraXus avatar Jul 06 '22 13:07 DraXus

Had the same error. I was able to fix it by calling mlflow.set_tracking_uri(tracking_uri) before any call to the mlflow API on the client side. The error is caused by mlflow calling get_tracking_uri() and getting the default tracking URI, which has the file scheme:

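import os
import mlflow
# Assumption: MLFlowLogger is PyTorch Lightning's logger; adjust the import to your setup
from pytorch_lightning.loggers import MLFlowLogger

tracking_uri = os.environ["MLFLOW_TRACKING_URL"]
# Set the URI globally so get_tracking_uri() does not fall back to the file-based default
mlflow.set_tracking_uri(tracking_uri)
logger = MLFlowLogger(
    run_name=run_name,  # run_name and experiment_name are defined elsewhere
    experiment_name=experiment_name,
    tracking_uri=tracking_uri,
)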

vanIvan avatar Jul 12 '22 15:07 vanIvan

I encountered the same issue. Make sure your MLFLOW_TRACKING_URI environment variable is set and points to your tracking server. That solved the issue in my case.

okoben avatar Jul 19 '22 16:07 okoben

Thanks @okoben, setting MLFLOW_TRACKING_URI also fixed my issue.

DraXus avatar Aug 24 '22 12:08 DraXus

@okoben's solution works, but it feels very much that an explicit extra argument is needed for mlflow gc in addition to the --backend-store-uri one

asolimando avatar May 24 '23 09:05 asolimando

Thanks @okoben. It worked.

Set MLFLOW_TRACKING_URI to your MLflow tracking URL. On Linux, run: export MLFLOW_TRACKING_URI=[https|http]://<your-url>
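For anyone scripting this, the workaround could be wrapped roughly like this. It is only a sketch: the helper name `build_gc_invocation` is hypothetical, and it merely prepares the command and environment described in this thread without running anything.

```python
import os
from urllib.parse import urlparse


def build_gc_invocation(tracking_uri, backend_store_uri=None, base_env=None):
    """Return (cmd, env) for running mlflow gc with MLFLOW_TRACKING_URI set."""
    # The error in this issue lists http and https as the only accepted schemes
    if urlparse(tracking_uri).scheme not in ("http", "https"):
        raise ValueError("tracking URI must start with http:// or https://")
    # Copy the environment so the caller's os.environ is left untouched
    env = dict(os.environ if base_env is None else base_env)
    env["MLFLOW_TRACKING_URI"] = tracking_uri
    cmd = ["mlflow", "gc"]
    if backend_store_uri:
        cmd += ["--backend-store-uri", backend_store_uri]
    return cmd, env
```

The result can then be passed to `subprocess.run(cmd, env=env, check=True)` on the machine that has access to the backend store.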

karanpathak avatar Sep 08 '23 06:09 karanpathak

@vanIvan Hi, sorry, but I don't understand how calling mlflow.set_tracking_uri(tracking_uri) fixes the problem with mlflow gc --backend-store-uri sqlite:///C:/mlflow/mlflow.db. In my case I have a backend store in RDS and an artifact store in an S3 bucket, and I can't run mlflow gc.

RodrigoCasarCQ avatar Sep 19 '23 13:09 RodrigoCasarCQ

@RodrigoCasarCQ what worked for me in order to have gc running correctly was to add MLFLOW_TRACKING_URI environment variable in my k8s deployment, as:

env:
        - name: MLFLOW_TRACKING_URI
          value: $YOUR_URI

asolimando avatar Sep 19 '23 14:09 asolimando

Finally!! It does work.

But another error is raised now. I cannot delete some of the experiments because the command mlflow gc --backend-store-uri sqlite:///C:/mlflow/mlflow.db fails with:

raise MlflowException(message=e, error_code=BAD_REQUEST)
mlflow.exceptions.MlflowException: (psycopg2.errors.ForeignKeyViolation) update or delete on table "experiments" violates foreign key constraint "datasets_experiment_id_fkey" on table "datasets"
DETAIL: Key (experiment_id)=(26) is still referenced from table "datasets".

It seems to be a bug in the new "datasets" table.

RodrigoCasarCQ avatar Sep 19 '23 15:09 RodrigoCasarCQ

Glad to hear! So, was it the MLFLOW_TRACKING_URI environment variable missing? (so it's clear for people down the line if that's all you needed to complete your setup, like in my case).

I am not an MLflow expert, but it seems the operations in the gc transaction are not properly ordered, which looks like a bug. I suggest opening a separate issue for that.
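To illustrate what an ordering problem of this kind looks like, here is a small standalone sketch using SQLite. The two-table schema is a made-up stand-in and not MLflow's real schema; it only shows that deleting a parent row before the rows that reference it violates the foreign key, while the reverse order succeeds.

```python
import sqlite3


def demo_delete_order():
    """Return (parent_first_failed, experiments_remaining) for a toy schema."""
    conn = sqlite3.connect(":memory:")
    # SQLite enforces foreign keys only when this pragma is enabled
    conn.execute("PRAGMA foreign_keys = ON")
    conn.execute("CREATE TABLE experiments (experiment_id INTEGER PRIMARY KEY)")
    conn.execute(
        "CREATE TABLE datasets ("
        "dataset_id INTEGER PRIMARY KEY, "
        "experiment_id INTEGER NOT NULL REFERENCES experiments(experiment_id))"
    )
    conn.execute("INSERT INTO experiments VALUES (26)")
    conn.execute("INSERT INTO datasets VALUES (1, 26)")
    try:
        # Wrong order: the experiment is still referenced from datasets
        conn.execute("DELETE FROM experiments WHERE experiment_id = 26")
        parent_first_failed = False
    except sqlite3.IntegrityError:
        parent_first_failed = True
    # Right order: remove the referencing rows first, then the experiment
    conn.execute("DELETE FROM datasets WHERE experiment_id = 26")
    conn.execute("DELETE FROM experiments WHERE experiment_id = 26")
    remaining = conn.execute("SELECT COUNT(*) FROM experiments").fetchone()[0]
    conn.close()
    return parent_first_failed, remaining
```

Whether this is actually what happens inside mlflow gc would need to be confirmed against the MLflow source.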

asolimando avatar Sep 19 '23 15:09 asolimando

Yeah, it was! In my case I was trying to run mlflow gc inside the EC2 instance that acts as the server, with MLFLOW_TRACKING_URI set to my EC2 URL. The correct approach is to set MLFLOW_TRACKING_URI to the local host instead:

export MLFLOW_TRACKING_URI=http://0.0.0.0:5000

RodrigoCasarCQ avatar Sep 19 '23 15:09 RodrigoCasarCQ

Hey, I am getting this error when I run my model evaluation pipeline:

ERROR: main: The configured tracking uri scheme: 'file' is invalid for use with the proxy mlflow-artifact scheme. The allowed tracking schemes are: {'http', 'https'}]

Could someone help me resolve it? I am using mlflow version 2.2.2.

kavearaasane avatar Dec 29 '23 16:12 kavearaasane