azure-sdk-for-python
azure-sdk-for-python copied to clipboard
AzureML SDK download_file slow for big files.
I am using azure ml SDK to download outputs of AML jobs. I am accessing job runs via azureml.core.Run class. I am suffering from extremly slow and timed out downloads. However I have neither access to the blob URI nor to the parameters of BlobServiceClient. Please help me how to setup Run.download_file so that I can control chunk size, timeout, etc or how can I get the blob object URI in a supported way.
Hi @egborbe, thanks for the feedback - @azureml-github will get back to you asap.
Thanks for the feedback! We are routing this to the appropriate team for follow-up. cc @xgithubtriage.
Thanks for the feedback! We are routing this to the appropriate team for follow-up. cc @Azure/azure-ml-sdk @azureml-github.
Could you share the SDK version details for us to engage?
Thanks for coming back to me. I use python3.10.7 with poetry on Ubuntu WSL. azureml-core = "^1.47.0" thats the onlything I have in my poetry setup file (pyproject.toml)
Hi @egborbe , Could you please share code which you are using to consume azureml.core.Run download?
Sure!
workspace = Workspace(
run_config.subscription_id,
run_config.resource_group,
run_config.workspace
)
for exp_name in run_config.pred_experiments:
exp = Experiment(workspace, exp_name)
test_tmp = os.path.join(tmp_root, exp_name)
os.makedirs(test_tmp)
for pred_run in exp.get_runs():
status = pred_run.get_status()
details = pred_run.get_details()
run_id = details['runId']
if status != "Completed":
print(f"Ignoring run {run_id}, status: {status}")
continue
metrics = pred_run.get_metrics()
tags = pred_run.get_tags()
filenames = set(pred_run.get_file_names())
run_definition = pred_run.get_details()["runDefinition"]
ckp_num, aml_train_id = cls.parse_run_definition( run_definition)
if 'checkpoint_number' in metrics.keys():
ckp_num = metrics['checkpoint_number']
if ckp_num is None:
raise RuntimeError(f"Could not find checkpoint number for run {details['runId']}")
if 'aml_run_id' in metrics.keys():
aml_train_id = metrics['aml_run_id']
elif 'training_run_id' in tags.keys():
aml_train_id = tags['training_run_id']
if aml_train_id is None:
raise RuntimeError(f"Could not find training run id for run {details['runId']}")
# sometimes train id is a list of identical elements, maybe a bug in AML?
if type(aml_train_id) in [list, tuple]:
aml_train_id = aml_train_id[0]
if isinstance(ckp_num, list):
ckp_num = int(ckp_num[0])
else:
ckp_num = int(ckp_num)
aml_train_ids[ckp_num] = aml_train_id
checkpoints[exp_name].append(ckp_num)
target_dir = os.path.join(test_tmp, str(ckp_num), "outputs")
if os.path.exists(target_dir):
print(f"Ignoring run {details['runId']}, already downloaded this epoch")
continue
os.makedirs(target_dir)
# files I need
pattern = re.compile(r'^outputs/.*(?:bla|gaga||true).*\.npy$', re.IGNORECASE)
for filename in filenames:
if pattern.match(filename):
pred_run.download_file(filename, target_dir)`
Actually the best solution for me would be if I could just obtain the Azure Blob Storage URI of the Run files and do the whole download code by myself, using requests or the blob storage client.
HI @egborbe , Hope below approach would be reach to your requirement to get path.
You can get url from this
Hi, thanks for the answer but I need the URI from python code and not the GUI. As this is an automated report generation tool and I cannot have the users look up the URI references manually for each run.
Hi @egborbe,
We can form string uri using below code.
from azureml.core import Workspace
workspace_name = 'iqbal-test-download-ws' # specify workspace here
datastore_name = 'workspaceartifactstore' #specify datastore here
# init workspace
ws = Workspace(
subscription_id="b17253fa-f327-42d6-9686-f3e553e24763",# specify subscription_id here
resource_group="v-isaudagar-download-rg",# specify resource_group here
workspace_name=workspace_name
)
#Storage URI
strUri = f'{ws.datastores[datastore_name].protocol}://{ws.datastores[datastore_name].account_name}.{ws.datastores[datastore_name].datastore_type.replace("Azure", "")}.{ws.datastores[datastore_name].endpoint}.{ws.datastores[datastore_name].container_name}'
Hi @egborbe. Thank you for opening this issue and giving us the opportunity to assist. To help our team better understand your issue and the details of your scenario please provide a response to the question asked above or the information requested above. This will help us more accurately address your issue.
Hi! What is the question? Thank you for your recommendation, however I havent had the opportunity to try it out as this issue has dropped in priority internally. Thanks for your help, again.
HI @egborbe, Hope you are good with provided approach. could you please confirm?
Hi @egborbe. Thank you for opening this issue and giving us the opportunity to assist. To help our team better understand your issue and the details of your scenario please provide a response to the question asked above or the information requested above. This will help us more accurately address your issue.
Hi @egborbe, we're sending this friendly reminder because we haven't heard back from you in 7 days. We need more information about this issue to help address it. Please be sure to give us your input. If we don't hear back from you within 14 days of this comment the issue will be automatically closed. Thank you!