tfx
LocalDagRunner: TFX pipeline run fails with ERROR "Failed to make stateful working dir"; Protocol error
OS: Linux Ubuntu 21.04. TensorFlow version: 2.6.2. TFX version: 1.4.0. Python version: 3.8.0.
Hello, I am following Building a TFX Pipeline Locally (https://www.tensorflow.org/tfx/guide/build_local_pipeline). I am running only the CsvExampleGen component, and I get the following error:
ERROR:absl:Failed to make stateful working dir: ./my_pipeline_output/CsvExampleGen/.system/stateful_working_dir/2022-01-05T11:04:16.463569
Traceback (most recent call last):
........
  File "/home/mc/anaconda3/envs/tfx_linux/lib/python3.8/site-packages/tensorflow/python/lib/io/file_io.py", line 514, in recursive_create_dir_v2
    _pywrap_file_io.RecursivelyCreateDir(compat.path_to_bytes(path))
tensorflow.python.framework.errors_impl.UnknownError: ./my_pipeline_output/CsvExampleGen/.system/stateful_working_dir/2022-01-05T11:04:16.463569; Protocol error
I looked into the function recursive_create_dir_v2() in tensorflow/python/lib/io/file_io.py, but that is as far as I got :-). I would appreciate any suggestions. I am certainly missing something. Thanks.
@miroC911,
Can you share the complete error trace and the line which triggered this error in the example program? Thanks!
@sanatmpa1
Hello, please see below. If you need more info, please let me know. Thanks.
Error trace:
INFO:absl:MetadataStore with DB connection initialized
INFO:absl:select span and version = (0, None)
INFO:absl:latest span and version = (0, None)
INFO:absl:MetadataStore with DB connection initialized
INFO:absl:Going to run a new execution 2
ERROR:absl:Failed to make stateful working dir: ./my_pipeline_output/CsvExampleGen/.system/stateful_working_dir/2022-01-11T13:41:58.037527
Traceback (most recent call last):
  File "/home/mc/anaconda3/envs/tfx_linux/lib/python3.8/site-packages/tfx/orchestration/portable/outputs_utils.py", line 220, in get_stateful_working_directory
    fileio.makedirs(stateful_working_dir)
  File "/home/mc/anaconda3/envs/tfx_linux/lib/python3.8/site-packages/tfx/dsl/io/fileio.py", line 78, in makedirs
    _get_filesystem(path).makedirs(path)
  File "/home/mc/anaconda3/envs/tfx_linux/lib/python3.8/site-packages/tfx/dsl/io/plugins/tensorflow_gfile.py", line 71, in makedirs
    tf.io.gfile.makedirs(path)
  File "/home/mc/anaconda3/envs/tfx_linux/lib/python3.8/site-packages/tensorflow/python/lib/io/file_io.py", line 514, in recursive_create_dir_v2
    _pywrap_file_io.RecursivelyCreateDir(compat.path_to_bytes(path))
tensorflow.python.framework.errors_impl.UnknownError: ./my_pipeline_output/CsvExampleGen/.system/stateful_working_dir/2022-01-11T13:41:58.037527; Protocol error
Triggered by:
File "/home/mc/git/TFX_Tutorials/TFX_tutorial/example_TFX_pipeline/my_pipeline.py", line 53, in run_pipeline
tfx.orchestration.LocalDagRunner().run(my_pipeline)
File "/home/mc/git/TFX_Tutorials/TFX_tutorial/example_TFX_pipeline/my_pipeline.py", line 57, in
@miroC911,
Can you share a simple standalone code or colab gist to reproduce the issue? Thanks!
@sanatmpa1 Hello, and thanks. Here is a gist that reproduces the issue: https://gist.github.com/miroC911/da835e2e1c5c22b7cb1c54e530962589
I'm running Windows 10, and the problem is in the method LocalDagRunner.run_with_ir (tfx.orchestration.local.local_dag_runner.py). When it substitutes the runtime parameter with a concrete run_id, it replaces pipeline_run_id with datetime.datetime.now().isoformat(), which uses a colon as the separator between HH, MM, and SS. Colons are not allowed in Windows file names.
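The colon problem described above can be checked without TFX. The snippet below is a minimal sketch that calls datetime.datetime.now().isoformat() directly and lists which characters of the result Windows rejects in file names. The WINDOWS_RESERVED set and the variable names are illustrative assumptions, not part of TFX.

```python
import datetime

# The same standard-library call that the comment above attributes to
# LocalDagRunner.run_with_ir when it builds a concrete run ID.
run_id = datetime.datetime.now().isoformat()

# Characters that Windows does not allow in file or directory names.
# (Illustrative constant, not a TFX or Python API.)
WINDOWS_RESERVED = set('<>:"/\\|?*')

# Which of those characters appear in the generated run ID?
offending = sorted(set(run_id) & WINDOWS_RESERVED)

print(run_id)
print(offending)
```

The only reserved character in the ISO timestamp is the colon, so any directory named after the raw run ID cannot be created on a file system that enforces Windows naming rules.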
@see tfx/issues/4474
In tfx\orchestration\portable\outputs_utils.py, using self.pipeline_run_id.replace(':', '') in the get_stateful_working_directory function fixes the issue.
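As a rough illustration of that workaround, the sketch below strips colons from a run ID before using it as a directory name. Both sanitize_run_id and make_stateful_working_dir are hypothetical helpers written for this example. They are not the TFX implementation, and they use os.makedirs rather than TFX's fileio layer.

```python
import os
import tempfile


def sanitize_run_id(run_id: str) -> str:
    """Hypothetical helper: drop colons so the ID is a valid Windows directory name."""
    return run_id.replace(':', '')


def make_stateful_working_dir(base_dir: str, run_id: str) -> str:
    """Hypothetical stand-in for the directory creation step (not the TFX function)."""
    path = os.path.join(
        base_dir, '.system', 'stateful_working_dir', sanitize_run_id(run_id))
    os.makedirs(path, exist_ok=True)
    return path


# Try it in a throwaway directory with the run ID from the error trace above.
with tempfile.TemporaryDirectory() as tmp:
    created = make_stateful_working_dir(tmp, '2022-01-11T13:41:58.037527')
    created_ok = os.path.isdir(created)
    print(os.path.basename(created))  # 2022-01-11T134158.037527
```

Removing the colons keeps the timestamp readable and unique to the microsecond, while avoiding a character that Windows file names cannot contain.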
@miroC911 As mentioned above, this issue is specific to Windows paths, and a workaround is mentioned here. Let's close this issue and track it here. Thanks!
I agree to close the issue. Please send any follow-up requests to @ruoyu90.