azure-cli-extensions
AzureML CLI v2 - R api - Mlflow causing pipeline to break
- If the issue is to do with Azure CLI 2.0 in particular, create an issue at Azure/azure-cli
Related command
`az ml job create`
Extension name (the extension in question)
ml
Description of issue (in as much detail as possible)
When an ML component runs R MLflow code inside a `with` block, following the documentation example (https://mlflow.org/docs/latest/R-api.html#mlflow-start-run), the component does not pass its output to the input of the following step correctly: the input is passed as a "DataReference" that does not point to the correct location. When the `with` block is removed, the pipeline runs correctly.
ComponentA working:
component.r:
```r
library(optparse)
library("carrier")
library(mlflow)

options <- list(
  make_option(c("-d", "--data_folder"), default = "./data"),
  make_option(c("-o", "--out_folder"), default = "./out")
)

opt_parser <- OptionParser(option_list = options)
opt <- parse_args(opt_parser)

paste(opt$data_folder)
paste(opt$out_folder)

run <- mlflow_start_run()
accidents <- readRDS(file.path(opt$data_folder, "accidents.Rd"))
summary(accidents)

saveRDS(accidents, file.path(opt$out_folder, "predictions.Rd"))
```
Pipeline component json, subsequent step:
```json
"runDefinition": {
  "script": null,
  "command": "Rscript eval_model.r --model $AZUREML_DATAREFERENCE_component_b_input",
  "useAbsolutePath": false,
  "arguments": [],
  "sourceDirectoryDataStore": null,
  "framework": "Python",
  "communicator": "None",
  "target": "cpu-cluster",
  "dataReferences": {},
  "data": {},
  "inputAssets": {
    "component_b_input": {
      "asset": {
        "assetId": "azureml://locations/westus2/workspaces/dc720ed6-90b3-49cb-8383-0897f5db4402/data/azureml_3450fa88-128c-4026-9065-a0e214261911_output_data_component_a_output/versions/1",
        "type": "UriFolder"
      },
      "mechanism": "Mount",
      "environmentVariableName": "AZURE_ML_INPUT_component_b_input",
      "pathOnCompute": null,
      "overwrite": true,
      "options": {
        "IsEvalMode": "False",
        "ReadWrite": "False",
        "ForceFolder": "False"
      }
    }
  },
```
ComponentA - results in error:
component.r:
```r
library(optparse)
library("carrier")
library(mlflow)

options <- list(
  make_option(c("-d", "--data_folder"), default = "./data"),
  make_option(c("-o", "--out_folder"), default = "./out")
)

opt_parser <- OptionParser(option_list = options)
opt <- parse_args(opt_parser)

paste(opt$data_folder)
paste(opt$out_folder)

with(run <- mlflow_start_run(), {
  accidents <- readRDS(file.path(opt$data_folder, "accidents.Rd"))
  summary(accidents)

  saveRDS(accidents, file.path(opt$out_folder, "predictions.Rd"))
})
```
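For context on why the two variants behave differently: in base R, `with(data, expr)` dispatches on the class of its first argument, and the linked MLflow docs show that `with()` on a run object ends the run automatically when the block exits. A minimal base-R sketch of that dispatch mechanism, assuming a made-up `demo_run` class (this is an illustration only, not mlflow's actual implementation):

```r
# Hypothetical sketch: a with() method that runs teardown when the block
# exits, mirroring the documented with(mlflow_start_run(), ...) pattern.
# 'demo_run' and its method are invented for illustration.
with.demo_run <- function(data, expr, ...) {
  on.exit(cat("run ended\n"))      # teardown after the block, like ending a run
  eval.parent(substitute(expr))    # evaluate the block in the caller's frame
}

run <- structure(list(id = "r1"), class = "demo_run")
res <- with(run, {
  cat("inside block\n")
  42
})
print(res)
```

If the output-tracking logic is tied to the run being ended (an assumption based on the symptom, not confirmed), this would explain why moving the `saveRDS()` call out of the `with` block changes how the step's output is resolved.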
Pipeline component json, subsequent step:
```json
"runDefinition": {
  "script": null,
  "command": "Rscript eval_model.r --model $AZUREML_DATAREFERENCE_model",
  "useAbsolutePath": false,
  "arguments": [],
  "sourceDirectoryDataStore": null,
  "framework": "Python",
  "communicator": "None",
  "target": "cpu-cluster",
  "dataReferences": {
    "model": {
      "dataStoreName": "workspaceblobstore",
      "mode": "Mount",
      "pathOnDataStore": "azureml/{name}/model_out/",
      "pathOnCompute": null,
      "overwrite": true
    }
  },
  "inputs": null
```
Steps to reproduce:
- Open the attachment
- Create the environment from the Dockerfile
- Uncomment lines 27 and 46 in /src/accident.R
- Run `az ml job create -f pipeline.yml`
route to CXP team
@jakeatmsft Apologies for the delay, we are looking into it.
Thanks for the feedback! We are routing this to the appropriate team for follow-up. cc @azureml-github.
Author: jakeatmsft
Assignees: -
Labels: -
Milestone: -