mlflow-export-import

Bug found in _extract_model_path function in import_model

Open tanveer-singh-db opened this issue 2 years ago • 2 comments

While extracting the model path, the code assumes the source has the format <base_path>/<source_run_id>/artifacts/model. It returns an incorrect model path for the format <custom_path>_artifacts/<source_run_id>/artifacts/model. Ideally it should return the part of the path after artifacts, i.e. model, but because artifacts also appears in the custom path, it returns <source_run_id>/artifacts/model.

tanveer-singh-db, Jun 06 '23 05:06

Can you give an example of the URL and the context? It's not clear what the issue is.

amesar, Jun 30 '23 17:06

I am facing this exact issue. In my case, the source path is mlflow-artifacts:/1/hash/artifacts/model, and _extract_model_path does the following:

def _extract_model_path(source, run_id):
    idx = source.find(run_id)
    if idx == -1:
        raise MlflowExportImportException(f"Cannot find run ID '{run_id}' in registered model version source field '{source}'", http_status_code=404)
    model_path = source[1+idx+len(run_id):]
    pattern = "artifacts"

    idx = source.find(pattern)
    if idx == -1: # Bizarre - sometimes there is no 'artifacts' after run_id
        model_path = ""
    else:
        model_path = source[1+idx+len(pattern):]
    return model_path

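Since my source path contains more than one instance of the pattern artifacts, source.find(pattern) matches the first one, inside the mlflow-artifacts: scheme, and ignores the run ID position computed earlier. The function therefore returns /1/hash/artifacts/model as the model path when, in fact, it should be model.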
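The behavior can be reproduced without MLflow installed. The sketch below is a condensed copy of the function quoted above, with MlflowExportImportException replaced by ValueError so that it runs standalone. That swap and the name extract_model_path_current are assumptions made for illustration; the source and run_id values are the ones from the model.json in this comment.

```python
def extract_model_path_current(source: str, run_id: str) -> str:
    """Condensed copy of the quoted logic (ValueError stands in for the
    project's MlflowExportImportException, an assumption for this sketch)."""
    if source.find(run_id) == -1:
        raise ValueError(f"Cannot find run ID '{run_id}' in source '{source}'")
    pattern = "artifacts"
    # Searches from the start of the string, so the first match may lie
    # before the run ID (here: inside the 'mlflow-artifacts:' scheme).
    idx = source.find(pattern)
    if idx == -1:
        return ""
    return source[1 + idx + len(pattern):]


source = "mlflow-artifacts:/1/f26b84e23d9e43609ca39838b9179bf8/artifacts/model"
run_id = "f26b84e23d9e43609ca39838b9179bf8"

result = extract_model_path_current(source, run_id)
print(result)  # /1/f26b84e23d9e43609ca39838b9179bf8/artifacts/model
```

The match lands at index 7, inside mlflow-artifacts, so everything after the colon is returned instead of model.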

This is my model.json file:

{
  "system": {
    "package_version": "1.2.0",
    "script": "export_model.py",
    "export_file_version": "2",
    "export_time": 1710795216,
    "_export_time": "2024-03-18 20:53:36",
    "mlflow_version": "2.11.1",
    "mlflow_tracking_uri": "http://mlflow:5000",
    "platform": {
      "python_version": "3.10.13",
      "system": "Linux",
      "processor": ""
    },
    "user": "root"
  },
  "info": {
    "num_target_stages": 0,
    "num_target_versions": 1,
    "num_src_versions": 1,
    "num_dst_versions": 1,
    "failed_versions": [],
    "export_latest_versions": false,
    "export_permissions": false
  },
  "mlflow": {
    "registered_model": {
      "name": "2024-03-14T13.32.45.375396Z",
      "creation_timestamp": 1710431030741,
      "last_updated_timestamp": 1710431031009,
      "_last_updated_timestamp": "2024-03-14 15:43:51",
      "_creation_timestamp": "2024-03-14 15:43:51",
      "versions": [
        {
          "name": "2024-03-14T13.32.45.375396Z",
          "version": "1",
          "creation_timestamp": 1710431031009,
          "last_updated_timestamp": 1710431031009,
          "description": "",
          "user_id": "",
          "current_stage": "None",
          "source": "mlflow-artifacts:/1/f26b84e23d9e43609ca39838b9179bf8/artifacts/model",
          "run_id": "f26b84e23d9e43609ca39838b9179bf8",
          "run_link": "",
          "status": "READY",
          "status_message": "",
          "tags": {},
          "aliases": [],
          "_run_artifact_uri": "mlflow-artifacts:/1/f26b84e23d9e43609ca39838b9179bf8/artifacts",
          "_experiment_name": "Rupture",
          "_last_updated_timestamp": "2024-03-14 15:43:51",
          "_creation_timestamp": "2024-03-14 15:43:51"
        }
      ]
    }
  }
}

I got the correct behavior in my case by prefixing the pattern with a forward slash: pattern = "/artifacts". I'm not sure this is a robust solution, though. Perhaps allowing the pattern to be passed as a command-line option would suit everyone.
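One possible alternative to changing the pattern, offered only as a sketch and not as the project's actual fix, is to start the search for artifacts after the run ID, which the function already locates. The function name extract_model_path_after_run_id, the ValueError, and the two extra example paths are hypothetical; only the mlflow-artifacts source comes from this thread.

```python
def extract_model_path_after_run_id(source: str, run_id: str,
                                    pattern: str = "artifacts") -> str:
    """Hypothetical variant: look for `pattern` only after the run ID."""
    run_idx = source.find(run_id)
    if run_idx == -1:
        # ValueError stands in for MlflowExportImportException (assumption).
        raise ValueError(f"Cannot find run ID '{run_id}' in source '{source}'")
    # str.find accepts a start offset, so earlier occurrences are skipped.
    idx = source.find(pattern, run_idx + len(run_id))
    if idx == -1:
        return ""  # same fallback as the original when the pattern is absent
    return source[1 + idx + len(pattern):]


run_id = "f26b84e23d9e43609ca39838b9179bf8"
examples = [
    f"mlflow-artifacts:/1/{run_id}/artifacts/model",  # from this thread
    f"dbfs:/my_artifacts/{run_id}/artifacts/model",   # hypothetical custom path
    f"/artifacts/store/{run_id}/artifacts/model",     # hypothetical; breaks "/artifacts"
]
results = [extract_model_path_after_run_id(s, run_id) for s in examples]
print(results)  # ['model', 'model', 'model']
```

The third path is a case where the "/artifacts" pattern would still match too early, since the base path itself contains an /artifacts segment.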

guizsantos, Mar 26 '24 12:03