[BUG]: Windows OSError when adding s3 artifact-store

Open Val3nt-ML opened this issue 3 years ago • 4 comments

Contact Details [Optional]

[email protected]

System Information

  • ZenML version: 0.9.0
  • Install path: C:\Users\vlaurent\Anaconda3\envs\re-estimation-38\Lib\site-packages\zenml
  • Python version: 3.8.13
  • Platform information: {'os': 'windows', 'windows_version_release': '10', 'windows_version': '10.0.19044', 'windows_version_service_pack': 'SP0', 'windows_version_os_type': 'Multiprocessor Free'}
  • Environment: native
  • Integrations: ['dash', 'evidently', 'facets', 'lightgbm', 'mlflow', 's3', 'scipy', 'sklearn', 'xgboost']

What happened?

Hello ZenML Team, As we discussed on Slack with @AlexejPenner and @htahir1, I'm encountering some issues when I define an S3 bucket as the artifact-store of my stack on Windows, locally. I hit the problem with a simple pipeline composed of 3 steps:

  • step1 = create_df1
  • step2 = create_df2
  • step3 = process_df (merges the two DataFrames)

My ZenML repository is located at C:\Users\vlaurent\My_Git_Repo\real-estate-pricer\src\exploration

My stack is composed of the following components:

  • Orchestrator: default
  • Metadata-store: default
  • Artifact-store: custom S3, configured as follows:

image

I've tried the exact same code and stack on Linux and it works properly. On Windows, I get an OSError, as shown here: image

It seems that ZenML tries to register the path of my bucket in the local repository, causing this issue.

PS: I don't know if it's related, but on a Windows machine, when I tried to execute a pipeline by putting pipeline.run() directly in a .py file (executed in the Python Console), I encountered the same type of issue. Thanks in advance. Valentin LAURENT

Reproduction steps

  1. Add a remote S3 bucket as the artifact-store of your ZenML stack
  2. Run any ZenML pipeline on Windows

Relevant log output

╭─────────────────────────────── Traceback (most recent call last) ────────────────────────────────╮
│ C:\Users\vlaurent\My_Git_Repo\real-estate-pricer\src\exploration\test_zenml.py:56 in <module>    │
│                                                                                                  │
│   53                                                                                             │
│   54 if __name__ == "__main__":                                                                  │
│   55 │   my_pipe = second_pipeline(step1=create_df1(), step2=create_df2(), step3=process_df()    │
│ ❱ 56 │   my_pipe.run()                                                                           │
│   57                                                                                             │
│   58                                                                                             │
│                                         │
│                                                                                                  │
│ ╭─────────────────────────────────────────── locals ───────────────────────────────────────────╮ │
│ │ onerror = <function rmtree.<locals>.onerror at 0x000001DDE0AC6280>                           │ │
│ │    path = 'C:\\Users\\vlaurent\\My_Git_Repo\\real-estate-pricer\\src\\exploration\\s3:\\dat… │ │
│ ╰──────────────────────────────────────────────────────────────────────────────────────────────╯ │
│                                                                                                  │
│ C:\Users\vlaurent\Anaconda3\envs\re-estimation-38\lib\shutil.py:596 in _rmtree_unsafe            │
│                                                                                                  │
│    593 # version vulnerable to race conditions                                                   │
│    594 def _rmtree_unsafe(path, onerror):                                                        │
│    595 │   try:                                                                                  │
│ ❱  596 │   │   with os.scandir(path) as scandir_it:                                              │
│    597 │   │   │   entries = list(scandir_it)                                                    │
│    598 │   except OSError:                                                                       │
│    599 │   │   onerror(os.scandir, path, sys.exc_info())                                         │
│                                                                                                  │
│ ╭─────────────────────────────────────────── locals ───────────────────────────────────────────╮ │
│ │ onerror = <function rmtree.<locals>.onerror at 0x000001DDE0AC6280>                           │ │
│ │    path = 'C:\\Users\\vlaurent\\My_Git_Repo\\real-estate-pricer\\src\\exploration\\s3:\\dat… │ │
│ ╰──────────────────────────────────────────────────────────────────────────────────────────────╯ │
╰──────────────────────────────────────────────────────────────────────────────────────────────────╯
OSError: [WinError 123] The filename, directory name, or volume label syntax is incorrect:
'C:\\Users\\vlaurent\\My_Git_Repo\\real-estate-pricer\\src\\exploration\\s3:\\data-warehouse\\zenml_artifact_store\\create_df1\\.system\\stateful_working_dir'
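As a quick check on the path in this error, here is a minimal sketch (not part of ZenML) that splits it into components with `pathlib.PureWindowsPath`, which parses Windows-style paths on any OS. The variable names are illustrative. It shows that `s3:` ends up as an ordinary directory name inside the local repository path. A colon is a reserved character in Windows file names, so this is a plausible trigger for WinError 123, though the sketch does not prove it.

```python
from pathlib import PureWindowsPath

# Path copied from the error message above (backslashes escaped for Python).
failing = (
    "C:\\Users\\vlaurent\\My_Git_Repo\\real-estate-pricer\\src\\exploration"
    "\\s3:\\data-warehouse\\zenml_artifact_store\\create_df1"
    "\\.system\\stateful_working_dir"
)

path = PureWindowsPath(failing)

# Skip the first component (the drive and root, 'C:\\') and look for
# any later component that still contains a colon.
suspicious = [part for part in path.parts[1:] if ":" in part]
print(suspicious)  # ['s3:']
```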

Code of Conduct

  • [X] I agree to follow this project's Code of Conduct

Val3nt-ML avatar Jun 22 '22 15:06 Val3nt-ML

Hey @Val3nt-ML, I may have found the cause of your issue today. We were using pathlib.Path to manipulate the artifact-store paths, which stripped one of the slashes from the S3 path. This led the materializer of the step outputs to try creating these directories locally on your machine.
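The path mangling described above can be reproduced without S3 access. The following is a minimal sketch, not ZenML's actual code: the repository path `C:\repo` and the helper `join_uri` are hypothetical, and the string-based join is only one possible way to avoid the problem, not necessarily what the fix PR does.

```python
from pathlib import PurePosixPath, PureWindowsPath

uri = "s3://data-warehouse/zenml_artifact_store"

# pathlib treats the URI as a file path and collapses the double slash,
# so the result is no longer a valid S3 URI.
as_posix = str(PurePosixPath(uri))
as_windows = str(PureWindowsPath(uri))
print(as_posix)    # s3:/data-warehouse/zenml_artifact_store
print(as_windows)  # s3:\data-warehouse\zenml_artifact_store

# The mangled value has no drive or root, so joining it to a local
# directory (hypothetical repository path) nests it inside that directory.
nested = PureWindowsPath(r"C:\repo") / PureWindowsPath(uri)
print(nested)  # C:\repo\s3:\data-warehouse\zenml_artifact_store


def join_uri(base: str, *parts: str) -> str:
    """Hypothetical helper: join with plain string operations so the
    scheme separator '://' is left untouched."""
    return "/".join([base.rstrip("/"), *(p.strip("/") for p in parts)])


print(join_uri(uri, "create_df1", "output"))
# s3://data-warehouse/zenml_artifact_store/create_df1/output
```

The nested result has the same shape as the path in the reported error: a local directory followed by an `s3:` component.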

The PR to fix this can be found here

It would be awesome if you could try pip-installing this branch to see if this error is solved for you.

AlexejPenner avatar Jun 28 '22 15:06 AlexejPenner

OK, thanks, I'll try it and keep you informed!

Val3nt-ML avatar Jun 29 '22 16:06 Val3nt-ML

I've just tested it, and it seems that I still have the same issue :(

OSError: [WinError 123] The filename, directory name, or volume label syntax is incorrect: 'C:\\Users\\vlaurent\\My_Git_Repo\\Test_ZenML\\s3:\\data-warehouse\\zenml_artifact_store\\fetch_input\\.system\\stateful_working_dir'

Val3nt-ML avatar Jul 01 '22 15:07 Val3nt-ML

Hi @Val3nt-ML, we're unable to replicate the issue on this branch. Can you maybe try in a fresh virtual environment in which you installed ZenML from this branch: pip install git+https://github.com/zenml-io/zenml.git@bugfix/windows-source-utils. (Just to avoid any potential issues with pip not upgrading the package)

schustmi avatar Jul 05 '22 12:07 schustmi