
[BUG] Unable to redeploy a model once undeployed on Windows

Open · rbhavna opened this issue 1 year ago • 1 comment

What is the bug? ML Commons: deploying a model again after undeploying it fails on Windows, because Windows keeps access to the model folder locked while the OpenSearch process is running.

  • The issue occurs only on Windows.
  • The lock prevents cleanup of the folder during undeploy, so re-deployment with the same model ID fails.
  • The folder is released only when the OpenSearch cluster is stopped.
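The general Windows behavior behind this can be illustrated outside OpenSearch. The following is a minimal sketch, not ML Commons code: the helper names `try_remove_tree` and `demo_locked_folder`, the `model_cache_` prefix, and the `model.pt` file name are all hypothetical stand-ins. It holds a file open inside a folder and then tries to delete the folder. On Windows the first attempt is expected to fail while the handle is open; on Linux or macOS it normally succeeds.

```python
import os
import shutil
import tempfile


def try_remove_tree(path):
    """Try to delete a directory tree; return True on success, False if the OS refuses."""
    try:
        shutil.rmtree(path)
        return True
    except OSError:
        # On Windows this is typically a PermissionError caused by an open handle.
        return False


def demo_locked_folder():
    """Hold a file open inside a folder, then attempt to delete the folder.

    Returns (removed_while_open, removed_after_close).
    """
    folder = tempfile.mkdtemp(prefix="model_cache_")  # hypothetical stand-in folder
    handle = open(os.path.join(folder, "model.pt"), "wb")  # hypothetical file name
    try:
        removed_while_open = try_remove_tree(folder)
    finally:
        handle.close()
    # If the first attempt already removed the folder, there is nothing left to do.
    removed_after_close = try_remove_tree(folder) if os.path.isdir(folder) else True
    return removed_while_open, removed_after_close


if __name__ == "__main__":
    print(demo_locked_folder())
```

In this sketch the second attempt succeeds once the handle is closed, which mirrors the report that the folder is released only after the OpenSearch process stops.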

How can one reproduce the bug? Steps to reproduce the behavior:

1) Register a model version

POST /_plugins/_ml/models/_register
{
  "name": "huggingface/sentence-transformers/all-MiniLM-L12-v2",
  "version": "1.0.1",
  "model_format": "TORCH_SCRIPT"
}

## Response
{
  "task_id": "FFe1cIkBT5fLQDwZTj7J",
  "status": "CREATED"
}

2) Get model_id using the task_id above:

GET /_plugins/_ml/tasks/<task_id>

3) Deploy using the model_id

POST /_plugins/_ml/models/<model_id>/_deploy

4) Undeploy it successfully

POST /_plugins/_ml/models/<model_id>/_undeploy

5) Redeploy the same model using the same model ID. The deployment fails.

POST /_plugins/_ml/models/<model_id>/_deploy
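The five steps above can be sketched as one script. This is only an outline under stated assumptions: `send(method, path, body)` is a hypothetical transport function that you would supply (for example, a thin wrapper around an HTTP client pointed at your cluster) and that returns the parsed JSON response. It is also assumed that the task response carries a `model_id` field once registration has finished; a real run would need to poll the task until it completes, and likewise wait for each deploy or undeploy to finish, before moving on.

```python
# Request body copied from step 1 of the report.
REGISTER_BODY = {
    "name": "huggingface/sentence-transformers/all-MiniLM-L12-v2",
    "version": "1.0.1",
    "model_format": "TORCH_SCRIPT",
}


def reproduce(send):
    """Run the register / deploy / undeploy / redeploy sequence.

    `send(method, path, body)` is a hypothetical callable returning parsed JSON.
    Returns the response of the final (re-)deploy request.
    """
    # 1) Register a model version.
    task = send("POST", "/_plugins/_ml/models/_register", REGISTER_BODY)

    # 2) Look up the task to get the model ID.
    #    Assumption: the task response includes "model_id" once registration is done.
    task_info = send("GET", f"/_plugins/_ml/tasks/{task['task_id']}", None)
    model_id = task_info["model_id"]

    base = f"/_plugins/_ml/models/{model_id}"
    send("POST", f"{base}/_deploy", None)    # 3) deploy
    send("POST", f"{base}/_undeploy", None)  # 4) undeploy
    return send("POST", f"{base}/_deploy", None)  # 5) redeploy (fails on Windows per this report)
```

Passing the transport in as a parameter keeps the sequence itself independent of any particular HTTP library or authentication setup.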

What is the expected behavior? When a user undeploys a model and then re-deploys the same model, the deployment should succeed.

What is your host/environment?

  • OS: Windows

Do you have any screenshots? If applicable, add screenshots to help explain your problem.

Do you have any additional context? Add any other context about the problem.

rbhavna · Jul 20 '23 00:07