Bug while updating the train_component in Python SDK v2 notebook

Open thomassantosh opened this issue 2 years ago • 4 comments

  • Operating System: macOS Ventura
  • Python Version: Python 3.8.15

Describe the bug In the past few weeks, when I've tried to run the tutorial notebook at this link: https://github.com/Azure/azureml-examples/blob/main/tutorials/e2e-ds-experience/e2e-ml-workflow.ipynb, I keep running into an issue when I execute this line of code: train_component = ml_client.create_or_update(train_component).

The error returned is AssetException: Error creating codes asset : Directory <my filepath>/components/train is empty. path or local_path must be a non-empty directory. However, the file path does contain the train.py and the train.yml file from the prior executed steps.

I've run this tutorial successfully before with the same code. This error seems to have come up only in the past month.

I've included the traceback below: traceback.txt

To Reproduce Steps to reproduce the behavior:

  1. Go through the tutorial until you hit the above code line.

Expected behavior Completion of the above step successfully. No errors while running train_component = ml_client.create_or_update(train_component).

Screenshots Added the traceback file above.

Additional context Does this come up in the automated testing cycles? Have builds of this notebook failed?

thomassantosh avatar Dec 21 '22 16:12 thomassantosh

Thanks for the feedback, /cc @azureml-github for input.

pvaneck avatar Dec 22 '22 00:12 pvaneck

Hi, just checking in on this issue. I just ran through the process again and am still getting stuck at this stage.

@luigiw - any thoughts on what's causing this?

thomassantosh avatar Jan 03 '23 15:01 thomassantosh

Hi @thomassantosh, I ran the tutorial on a Mac but could not reproduce the error. My environment is:

  • OS: Ventura 13.1
  • azure-ai-ml=1.2.0
  • python=3.8.13

Which SDK version are you using? Have you tried creating a new Python environment and running the tutorial again?

0mza987 avatar Jan 04 '23 09:01 0mza987
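
For anyone answering the SDK version question above, here is a minimal sketch for reporting the installed version from inside the notebook's Python environment. It uses only the standard library; the helper name `installed_version` is hypothetical and not part of azure-ai-ml.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(package_name):
    """Return the installed version string of a package, or None if it is not installed.

    `installed_version` is a hypothetical helper for this thread, not an SDK function.
    """
    try:
        return version(package_name)
    except PackageNotFoundError:
        return None


# Report the SDK version used by the current environment (None if not installed).
print("azure-ai-ml:", installed_version("azure-ai-ml"))
```

Pasting the printed value into the issue makes it easier to compare against the azure-ai-ml=1.2.0 environment mentioned above.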

Hi @0mza987, I've been using this repo to go through the workflow: https://github.com/ts-azure-services/aml-managed-endpoints. It is a copy of a notebook from the azureml-samples, extended to test batch endpoints as well, since I was curious about the limits of the managed online endpoint. If you look at the Makefile, I'm using python=3.8 and installing azure-ai-ml in my local conda environment. However, I'm not pinning any dependency versions. Perhaps I should, but I assumed that when I pip install the latest azure-ai-ml package it would be backward compatible and not break, especially for this scenario. Can you try running this notebook and see if you get the same error with the same dependencies? Otherwise, happy to sync offline.

thomassantosh avatar Jan 04 '23 15:01 thomassantosh

I can repro this error by running the notebook you provided, and I reproduced it on Windows as well. I'm not sure about the root cause yet, but I'll investigate further and update here soon.

0mza987 avatar Jan 05 '23 12:01 0mza987

Great! Thanks @0mza987 ! Look forward to resolving this.

thomassantosh avatar Jan 05 '23 15:01 thomassantosh

+1, I'm having the same issue

karishma-dixit avatar Jan 05 '23 17:01 karishma-dixit

This looks like an issue with the example to me, as I don't see the referenced folder, ./component/train, in the same folder as the notebook. I'll find the owner to fix this.

luigiw avatar Jan 05 '23 23:01 luigiw

@luigiw let me take care of this issue, I'll investigate and report findings here asap.

jfomhover avatar Jan 05 '23 23:01 jfomhover

@luigiw The referenced folder is created inside the notebook. The customer ran this notebook in his own repo, and the .gitignore file there excludes those snapshot files. @thomassantosh It looks like the .gitignore file in your repo ignores these folders, which causes this error.

image

Try deleting these lines and the error should no longer be raised.

You can also refer to this link about the ignore rules applied when creating a snapshot:

image

0mza987 avatar Jan 06 '23 03:01 0mza987

Doh! Good catch. I forgot that, as part of the upload operation, it might refer to the .gitignore file. I've added an .amlignore file with less restrictive inclusions. Thanks! Will close this.

thomassantosh avatar Jan 06 '23 16:01 thomassantosh
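
To illustrate the resolution above, here is a rough sketch of why a folder that contains train.py and train.yml can still look empty to the upload step once ignore rules are applied. This is not the SDK's actual implementation: the matching below is a simplified stand-in for gitignore semantics (no negation, anchoring, or `**`), and the helper names and the example patterns (`components/`, `__pycache__/`) are hypothetical, since the real .gitignore contents were only shown in a screenshot.

```python
import fnmatch
import tempfile
from pathlib import Path


def is_ignored(rel_path, patterns):
    """Simplified ignore check (assumption: NOT full gitignore semantics).

    A pattern matches if it equals a leading directory of the path,
    matches the whole path, or matches any single path segment.
    """
    segments = rel_path.split("/")
    for raw in patterns:
        pat = raw.strip().rstrip("/")
        if not pat or pat.startswith("#"):
            continue  # skip blank lines and comments
        if rel_path.startswith(pat + "/") or fnmatch.fnmatch(rel_path, pat):
            return True
        if any(fnmatch.fnmatch(seg, pat) for seg in segments):
            return True
    return False


def files_to_upload(repo_root, code_dir, patterns):
    """List files under code_dir (relative to repo_root) that survive the patterns."""
    root = Path(repo_root)
    kept = []
    for path in sorted(Path(code_dir).rglob("*")):
        if path.is_file():
            rel = path.relative_to(root).as_posix()
            if not is_ignored(rel, patterns):
                kept.append(rel)
    return kept


def demo():
    """Build a throwaway repo layout and compare two hypothetical ignore rule sets."""
    with tempfile.TemporaryDirectory() as tmp:
        train_dir = Path(tmp) / "components" / "train"
        train_dir.mkdir(parents=True)
        (train_dir / "train.py").write_text("print('train')\n")
        (train_dir / "train.yml").write_text("name: train\n")

        gitignore_like = ["components/"]   # hypothetical restrictive rule
        amlignore_like = ["__pycache__/"]  # hypothetical less restrictive rule

        with_gitignore = files_to_upload(tmp, train_dir, gitignore_like)
        with_amlignore = files_to_upload(tmp, train_dir, amlignore_like)
    return with_gitignore, with_amlignore


if __name__ == "__main__":
    restrictive, relaxed = demo()
    print("kept with restrictive rules:", restrictive)
    print("kept with relaxed rules:", relaxed)
```

With the restrictive rule, nothing under components/train is left to upload, which is consistent with the "Directory ... is empty" message reported in this issue; with the relaxed rule, both files are kept.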