ucaip-labs

Dataflow crashing in Vertex AI pipeline

Open miansaadahmad opened this issue 3 years ago • 0 comments

When running Dataflow from the Vertex AI pipeline in ucaip-labs/04-pipeline-deployment.ipynb, the test at the start of the notebook runs fine. However, when I start the pipeline with the JSON pipeline definition file, it fails once it reaches the Dataflow step with the error "ModuleNotFoundError: No module named 'user_module_0'".
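For context, a ModuleNotFoundError is raised when the Python interpreter running the code cannot resolve the module name on its import path. The following is a minimal sketch of that check, assuming it is run in the environment under inspection; the helper name `is_importable` is hypothetical and is not part of the notebook, TFX, or Dataflow.

```python
import importlib.util


def is_importable(module_name: str) -> bool:
    """Return True if `module_name` can be resolved on the current sys.path.

    Hypothetical helper for illustration only; not part of the notebook.
    """
    return importlib.util.find_spec(module_name) is not None


if __name__ == "__main__":
    # 'json' is in the standard library; 'user_module_0' is the name from the error.
    for name in ("json", "user_module_0"):
        print(f"{name}: importable={is_importable(name)}")
```

Running this locally only describes the local environment. It says nothing about what is installed on the remote Dataflow workers, which have their own import path.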

I have tried the following fixes, but none of them work:

  1. Adding the path of the transformation module to the setup.py file.
  2. Placing the transformation module in GCS and changing the path in the TFX component to point there.
  3. Placing the transformation module in the root directory and providing the absolute path.
  4. Moving the repo to GCS.

The following notebooks run without any issues: ucaip-labs/01-dataset-management.ipynb, ucaip-labs/02-experimentation.ipynb, and ucaip-labs/03-training-formalization.ipynb.

I am not sure what I am missing, and I wasn't able to find any solution on other platforms that works for me. I would really appreciate it if you could recommend a fix.

I have tried both the Chicago Taxi Trips data and another retail dataset; both have the same issue with Dataflow.

miansaadahmad, Aug 25 '21 09:08