
AzureML CLI v2 - R api - Mlflow causing pipeline to break

Open: jakeatmsft opened this issue 1 year ago (1 comment)

  • If the issue is to do with Azure CLI 2.0 in particular, create an issue here at Azure/azure-cli

Related command

az ml job create

Extension name (the extension in question)

ml

Description of issue (in as much detail as possible)

When running an ML component whose R code uses the MLflow R API inside a "with statement" block, following the documentation example (https://mlflow.org/docs/latest/R-api.html#mlflow-start-run), the component does not pass its output to the input of the following step correctly: the input is passed as a "DataReference" and does not point to the correct location. When the "with statement" block is removed, the pipeline runs correctly.
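For reference, the documented pattern scopes the run to the block; a minimal sketch contrasting the two styles (assuming, per the linked docs, that with() on an MLflow run ends the run when the block exits):

```r
library(mlflow)

# Documented pattern: the run is ended automatically when the block exits.
with(mlflow_start_run(), {
  mlflow_log_metric("test", 10)
})

# Equivalent explicit form; this is the variant that works in the pipeline below.
run <- mlflow_start_run()
mlflow_log_metric("test", 10)
mlflow_end_run()
```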

ComponentA working:

component.r:

library(optparse)
library("carrier")
library(mlflow)

options <- list(
  make_option(c("-d", "--data_folder"), default = "./data"),
  make_option(c("-o", "--out_folder"), default = "./out")
)

opt_parser <- OptionParser(option_list = options)
opt <- parse_args(opt_parser)

paste(opt$data_folder)
paste(opt$out_folder)

run <- mlflow_start_run()
accidents <- readRDS(file.path(opt$data_folder, "accidents.Rd"))
summary(accidents)

saveRDS(accidents, file.path(opt$out_folder, "predictions.Rd"))


Pipeline component json, subsequent step:

"runDefinition": {
  "script": null,
  "command": "Rscript eval_model.r --model $AZUREML_DATAREFERENCE_component_b_input",
  "useAbsolutePath": false,
  "arguments": [],
  "sourceDirectoryDataStore": null,
  "framework": "Python",
  "communicator": "None",
  "target": "cpu-cluster",
  "dataReferences": {},
  "data": {},
  "inputAssets": {
    "component_b_input": {
      "asset": {
        "assetId": "azureml://locations/westus2/workspaces/dc720ed6-90b3-49cb-8383-0897f5db4402/data/azureml_3450fa88-128c-4026-9065-a0e214261911_output_data_component_a_output/versions/1",
        "type": "UriFolder"
      },
      "mechanism": "Mount",
      "environmentVariableName": "AZURE_ML_INPUT_component_b_input",
      "pathOnCompute": null,
      "overwrite": true,
      "options": {
        "IsEvalMode": "False",
        "ReadWrite": "False",
        "ForceFolder": "False"
      }
    }
  },

ComponentA - results in error: component.r:

library(optparse)
library("carrier")
library(mlflow)

options <- list(
  make_option(c("-d", "--data_folder"), default = "./data"),
  make_option(c("-o", "--out_folder"), default = "./out")
)

opt_parser <- OptionParser(option_list = options)
opt <- parse_args(opt_parser)

paste(opt$data_folder)
paste(opt$out_folder)

with(run <- mlflow_start_run(), {
  accidents <- readRDS(file.path(opt$data_folder, "accidents.Rd"))
  summary(accidents)

  saveRDS(accidents, file.path(opt$out_folder, "predictions.Rd"))
})
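A hypothetical workaround sketch (an assumption, not a confirmed fix): keep with()'s guarantee that the run is ended, but avoid dispatching with() on the run object, in case that method interferes with how AzureML resolves the component's output path. The opt stand-in replaces the parse_args() result from the script above.

```r
library(mlflow)

# Stand-in for the optparse result in the original script (hypothetical values).
opt <- list(data_folder = "./data", out_folder = "./out")

run <- mlflow_start_run()
tryCatch({
  accidents <- readRDS(file.path(opt$data_folder, "accidents.Rd"))
  summary(accidents)
  saveRDS(accidents, file.path(opt$out_folder, "predictions.Rd"))
}, finally = mlflow_end_run())  # run is ended even if the body errors
```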


Pipeline component json, subsequent step:

"runDefinition": { "script": null, "command": "Rscript eval_model.r --model $AZUREML_DATAREFERENCE_model", "useAbsolutePath": false, "arguments": [], "sourceDirectoryDataStore": null, "framework": "Python", "communicator": "None", "target": "cpu-cluster", "dataReferences": { "model": { "dataStoreName": "workspaceblobstore", "mode": "Mount", "pathOnDataStore": "azureml/{name}/model_out/", "pathOnCompute": null, "overwrite": true } }, "inputs" : null

Steps to reproduce:

  • open the attached accidents-copy.zip
  • create the environment from the Dockerfile
  • uncomment /src/accident.R lines 27 and 46
  • run az ml job create -f pipeline.yml

accidents-copy.zip

jakeatmsft avatar Aug 12 '22 18:08 jakeatmsft

route to CXP team

yonzhan avatar Aug 12 '22 22:08 yonzhan

@jakeatmsft Apologies for the delay, we are looking into it.

RakeshMohanMSFT avatar Oct 19 '22 04:10 RakeshMohanMSFT

Thanks for the feedback! We are routing this to the appropriate team for follow-up. cc @azureml-github.

Issue Details

Author: jakeatmsft
Assignees: -
Labels:

extension/ml, customer-reported, Machine Learning, Service Attention, Auto-Assign

Milestone: -

ghost avatar Oct 31 '22 08:10 ghost