azure-cli-extensions
ModuleNotFoundError when deploying Azure ML model to online endpoint
Describe the bug
I am trying to deploy my Azure ML model to an online endpoint hosted on Azure, following the documentation at https://learn.microsoft.com/en-us/azure/machine-learning/how-to-deploy-online-endpoints?view=azureml-api-2&tabs=azure-cli. However, the deployment fails to import one of the files that should be included in the code directory alongside the scoring script. The inferencing logic is considerably more complex than in the linked documentation, so additional files must be deployed beyond just the scoring script.
Related command
az ml online-deployment create -f azure/deployment.yml --resource-group <omitted> --workspace-name <omitted>
I have deployed the exact same scoring script, with everything else identical, as a local deployment, and it works perfectly. The issue only occurs when deploying to my Azure endpoint.
Errors
The deployment fails with ResourceNotReady. Upon inspection of the logs in Azure Studio, the error is:
2023-09-29 19:00:24,282 E [65] azmlinfsrv - Traceback (most recent call last):
File "/azureml-envs/azureml_de0eccae1d7809be16f4cee8f4e60c52/lib/python3.10/site-packages/azureml_inference_server_http/server/user_script.py", line 77, in load_script
main_module_spec.loader.exec_module(user_module)
File "
The directory structure is as follows:
root:
- __init__.py
- scoring_script.py
- utils/
  - __init__.py
  - post_processing.py
- models/
  - __init__.py
  - StrengthNet.py
  - RateNet.py
- azure/
  - deployment.yml
Issue script & Debug output
My scoring script is as follows:
import json

import pandas as pd
import torch

from models.RateNet import RateNet
from models.StrengthNet import StrengthNet
from utils.post_processing import post_process

strength_net = None
rate_net = None


def init():
    global config
    global strength_net
    global rate_net
    # Initialization
    # Omitted for business sensitivity


def predict(data, metadata):
    # Omitted ...
    return predictions


def run(request):
    request = json.loads(request)
    data = torch.tensor(request["data"])
    metadata = torch.tensor(request["metadata"])
    # Omitted
    predictions = predict(data, metadata)
    return predictions.tolist()
Note that the first two local imports (from the models package) succeed, even though those are local files as well. The program only crashes on the import from the utils package.
Deployment YAML file:
$schema: https://azuremlschemas.azureedge.net/latest/managedOnlineDeployment.schema.json
name: deployment-name-omitted
endpoint_name: endpoint-name-omitted
model: azureml:ModelNameOmitted:2
code_configuration:
  code: ../
  scoring_script: scoring_script.py
environment: azureml:env-name-omitted:14
instance_type: Standard_DS4_v2
instance_count: 1
Expected behavior
The expected behaviour is that the deployment succeeds and that utils.post_processing is imported without error. When deployed locally with the --local flag, everything works fine. Additionally, the deployment successfully imports from files in the models directory, so there should logically be no difference for the utils directory.
Environment Summary
azure-cli 2
Additional context
No response