ml-pipelines-sdk does not run on Windows

Open yonil7 opened this issue 3 years ago • 3 comments

(This issue follows #950 and is more focused)

The following code does not work on Windows:

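# Imports are assumed; the original report omitted them.
from tfx.dsl.component.experimental.decorators import component
from tfx.orchestration.local.local_dag_runner import LocalDagRunner
from tfx.orchestration.metadata import sqlite_metadata_connection_config
from tfx.orchestration.pipeline import Pipeline

@component
def test_component():
    print('hi from test component')

def main():
    pipe = Pipeline(
        pipeline_name='test-pipe',
        pipeline_root=r'c:\tmp\tfx-root',
        components=[test_component()],
        metadata_connection_config=sqlite_metadata_connection_config(r'c:\tmp\tfx-metadata.db')
    )
    LocalDagRunner().run(pipe)

if __name__ == '__main__':
    main()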

The error is:

OSError: [WinError 123] The filename, directory name, or volume label syntax is incorrect: 'c:\\tmp\\tfx-root\\test_component\\.system\\stateful_working_dir\\2021-11-18T18:56:34.733472'

This happens because on Windows the character : is not allowed in file or folder names, and the last path component here is a timestamp-style run ID that contains colons. (The file c:\tmp\tfx-metadata.db and all the folders up to stateful_working_dir are created successfully.)
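To illustrate the point above, here is a minimal sketch that checks a single path component against the characters Windows reserves in file names. It assumes the failing directory name is an ISO 8601 timestamp, which is what the error message suggests; the helper name `invalid_chars` is hypothetical and not part of TFX.

```python
import datetime

# Characters Windows reserves in file and directory names
# (path separators are handled separately and are not listed here).
WINDOWS_RESERVED = set('<>:"|?*')

def invalid_chars(name: str) -> set:
    """Hypothetical helper: return reserved characters found in one path component."""
    return WINDOWS_RESERVED & set(name)

# Rebuild the timestamp seen in the error message.
run_id = datetime.datetime(2021, 11, 18, 18, 56, 34, 733472).isoformat()

print(run_id)                                 # 2021-11-18T18:56:34.733472
print(invalid_chars(run_id))                  # {':'}
print(invalid_chars('stateful_working_dir'))  # set()
```

The check is applied per component because the colon in a drive prefix such as c: is legal; only a colon inside a file or folder name triggers WinError 123.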

This means GCP Vertex AI TFX pipelines cannot be developed, tested, or run on Windows.

BTW, I didn't find this limitation mentioned in the Vertex AI product documentation.

yonil7, Nov 18 '21 17:11

FYI, in tfx\orchestration\portable\outputs_utils.py, using self._pipeline_run_id.replace(':', '_') in the get_stateful_working_directory function fixes the issue. For TFX 1.3.4, it looks like this: [image]
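As a rough sketch of what that replacement does to the path, assuming the directory layout shown in the error message: the function below is a hypothetical stand-in, not the actual TFX implementation of get_stateful_working_directory. It uses `ntpath` so the Windows-style result is the same on any operating system.

```python
import ntpath  # Windows path rules, importable on any OS

def sanitized_working_dir(node_dir: str, pipeline_run_id: str) -> str:
    """Hypothetical helper, not TFX code: build the working dir with ':' replaced."""
    # Same substitution as suggested in the comment above.
    safe_run_id = pipeline_run_id.replace(':', '_')
    return ntpath.join(node_dir, '.system', 'stateful_working_dir', safe_run_id)

path = sanitized_working_dir(
    r'c:\tmp\tfx-root\test_component',
    '2021-11-18T18:56:34.733472',
)
print(path)
# c:\tmp\tfx-root\test_component\.system\stateful_working_dir\2021-11-18T18_56_34.733472
```

After the substitution, the only colon left in the path is the drive prefix, which Windows accepts.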

aurelienmorgan, Jun 20 '22 13:06

Is there any update on this?

yonil7, Aug 02 '22 05:08

@yonil7,

This happens because on Windows the character : is not allowed in file or folder names. As mentioned in the comments above, the suggested workaround is to add self._pipeline_run_id.replace(':', '_') in the get_stateful_working_directory function in the tfx\orchestration\portable\outputs_utils.py file.

Please let me know if it helps. Thank you!

singhniraj08, Oct 14 '22 10:10