ucaip-labs
Dataflow crashing in Vertex AI pipeline
When I run Dataflow from the Vertex AI pipeline in ucaip-labs/04-pipeline-deployment.ipynb, the test at the start of the notebook runs fine. However, when I start the pipeline from the JSON pipeline definition file, it fails as soon as it reaches the Dataflow step with the error "ModuleNotFoundError: No module named 'user_module_0'".
I have tried the following fixes, but none of them worked:
- Adding the path of the transformation module to the setup.py file.
- Placing the transformation module in GCS and changing the module path in the TFX component to the GCS path.
- Placing the transformation module in the root directory and providing the absolute path.
- Moving the repo to GCS.
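For context, a setup.py-based attempt like the first one usually means passing the file to the Dataflow workers through the Beam pipeline arguments. The snippet below is only a minimal sketch of that idea, not the notebook's actual configuration: the helper `build_beam_args` and the project, region, and bucket values are hypothetical placeholders, and it is not confirmed that this resolves the error.

```python
def build_beam_args(project, region, temp_location, setup_file):
    """Return Beam pipeline args that point Dataflow at a local setup.py.

    All argument values are placeholders supplied by the caller.
    """
    return [
        "--runner=DataflowRunner",
        f"--project={project}",
        f"--region={region}",
        f"--temp_location={temp_location}",
        # Dataflow builds and installs this package on each worker.
        f"--setup_file={setup_file}",
    ]


# Hypothetical values, for illustration only.
beam_args = build_beam_args(
    project="my-project",
    region="us-central1",
    temp_location="gs://my-bucket/tmp",
    setup_file="./setup.py",
)
print(beam_args)
```

A list like this would then be handed to the component or pipeline that launches the Dataflow job.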
The following notebooks run fine without any issue:
- ucaip-labs/01-dataset-management.ipynb
- ucaip-labs/02-experimentation.ipynb
- ucaip-labs/03-training-formalization.ipynb
I am not sure what I am missing, and I wasn't able to find a solution elsewhere that works for me. I would really appreciate it if you could recommend a fix.
I have tried with both the Chicago Taxi Trips data and another retail dataset; both hit the same issue with Dataflow.