ml-pipelines-sdk does not run on Windows
(This issue follows #950 and is more focused)
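The following code does not work on Windows:
from tfx.dsl.component.experimental.decorators import component
from tfx.orchestration.local.local_dag_runner import LocalDagRunner
from tfx.orchestration.metadata import sqlite_metadata_connection_config
from tfx.orchestration.pipeline import Pipeline

@component
def test_component():
    print('hi from test component')

def main():
    pipe = Pipeline(
        pipeline_name='test-pipe',
        pipeline_root=r'c:\tmp\tfx-root',
        components=[test_component()],
        metadata_connection_config=sqlite_metadata_connection_config(r'c:\tmp\tfx-metadata.db')
    )
    LocalDagRunner().run(pipe)

if __name__ == '__main__':
    main()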
The error is:
OSError: [WinError 123] The filename, directory name, or volume label syntax is incorrect: 'c:\\tmp\\tfx-root\\test_component\\.system\\stateful_working_dir\\2021-11-18T18:56:34.733472'
This happens because on Windows the character : is not allowed in file or folder names, and the run ID used as the last folder name is a timestamp that contains colons. (The file c:\tmp\tfx-metadata.db and all the folders up to stateful_working_dir are created successfully.)
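To make the diagnosis easy to check, here is a minimal sketch that looks at the last component of the failing path and reports which Windows-reserved characters it contains. The helper name reserved_chars_in_name and the reserved-character set are my own for illustration, not anything from TFX; ntpath is used so the check behaves the same on any OS.

```python
import ntpath

# Characters Windows rejects inside a single file or folder name.
WINDOWS_RESERVED_CHARS = set('<>:"/\\|?*')

def reserved_chars_in_name(path: str) -> list:
    """Return the reserved characters found in the last component of a Windows path."""
    # ntpath applies Windows path rules even when run on Linux or macOS.
    name = ntpath.basename(path)
    return sorted(set(name) & WINDOWS_RESERVED_CHARS)

# Path copied from the error message above.
failing = r'c:\tmp\tfx-root\test_component\.system\stateful_working_dir\2021-11-18T18:56:34.733472'
print(reserved_chars_in_name(failing))
```

Only the final, timestamp-named folder is flagged; the drive-letter colon in c: is not part of any folder name, which matches the observation that the parent folders are created without trouble.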
This means GCP Vertex AI TFX pipelines can't be developed, tested, or run on Windows.
BTW, I didn't find this limitation mentioned in the Vertex AI product documentation.
FYI, in tfx\orchestration\portable\outputs_utils.py, using self._pipeline_run_id.replace(':', '_') in the get_stateful_working_directory function fixes the issue.
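For anyone who wants to see the effect of that replacement without patching TFX, here is a rough standalone sketch. The functions windows_safe_run_id and build_stateful_working_dir are hypothetical names I made up; they only mimic the path layout seen in the error message and are not the actual TFX implementation.

```python
import ntpath

def windows_safe_run_id(pipeline_run_id: str) -> str:
    """Replace colons, which Windows does not allow in folder names."""
    return pipeline_run_id.replace(':', '_')

def build_stateful_working_dir(node_dir: str, pipeline_run_id: str) -> str:
    """Hypothetical stand-in that mirrors the folder layout from the error message."""
    # ntpath.join uses Windows separators regardless of the host OS.
    return ntpath.join(
        node_dir,
        '.system',
        'stateful_working_dir',
        windows_safe_run_id(pipeline_run_id),
    )

# Node directory and run ID taken from the error message above.
path = build_stateful_working_dir(
    r'c:\tmp\tfx-root\test_component',
    '2021-11-18T18:56:34.733472',
)
print(path)
```

With the colons replaced by underscores, the last folder name becomes 2021-11-18T18_56_34.733472, which Windows accepts.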
For TFX 1.3.4, it looks like so:
Is there any update on this?
@yonil7,
This happens because on Windows the character : is not allowed in file or folder names. As mentioned in the comments above, the suggested workaround is to apply self._pipeline_run_id.replace(':', '_') in the get_stateful_working_directory function in the tfx\orchestration\portable\outputs_utils.py file.
Kindly let me know if it helps. Thank you!