Continuous Adaptation with VertexAI's AutoML and Pipeline
This repository contains two notebooks that demonstrate how to automatically produce a new AutoML model whenever a new dataset comes in. The project uses Vertex AI in general, along with Vertex Managed Dataset, Vertex Pipeline, Vertex AutoML, Cloud Storage, and Cloud Function on Google Cloud Platform.

About the notebooks
There are two notebooks in this project. Everything can be set up by running each cell in the notebooks. The only thing you need to do manually is set up the IAM permissions.
IAM Setup
- For Vertex Pipeline, we need the Vertex Admin, Cloud Storage Viewer, and Cloud Storage Editor permissions. Some ML components need to access the managed dataset, and the pipeline itself is stored in a GCS (Google Cloud Storage) bucket, so these permissions are required. They can be granted to the compute service account, since Vertex Pipeline uses Compute Engine to run each component of the ML pipeline. Also, don't forget to enable the compute service account and the Vertex AI API.
- For Cloud Function, we need the Vertex Admin permission and Cloud Build enabled. The Docker image that the Cloud Function is based on is built by Cloud Build, so the Cloud Build API must be enabled. The Cloud Function also triggers the Vertex Pipeline, which is why it needs Vertex Admin.
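To make the Cloud Function's role more concrete, here is a minimal sketch of a function that submits a pipeline run when a new object lands in a bucket. It is not the code from the notebooks: the project ID, region, bucket paths, the `gcs_source` pipeline parameter, and the `.csv` filter are all hypothetical placeholders, and it assumes the `google-cloud-aiplatform` package is installed in the function's runtime.

```python
"""Sketch of a Cloud Function that triggers a Vertex Pipeline run (assumed names)."""

# All values below are hypothetical placeholders, not taken from the notebooks.
PROJECT_ID = "my-project"
REGION = "us-central1"
PIPELINE_SPEC_URI = "gs://my-bucket/pipeline_spec.json"
PIPELINE_ROOT = "gs://my-bucket/pipeline_root"


def build_job_args(event):
    """Turn a Cloud Storage event payload into PipelineJob keyword arguments.

    Returns None when the uploaded object should not trigger a run.
    """
    name = event.get("name", "")
    bucket = event.get("bucket", "")
    # Assumption: only CSV import files announce a new dataset.
    if not bucket or not name.endswith(".csv"):
        return None
    return {
        "display_name": "continuous-adaptation-run",
        "template_path": PIPELINE_SPEC_URI,
        "pipeline_root": PIPELINE_ROOT,
        # 'gcs_source' is a hypothetical pipeline parameter name.
        "parameter_values": {"gcs_source": f"gs://{bucket}/{name}"},
    }


def trigger_pipeline(event, context=None):
    """Entry point for a Cloud Storage-triggered background function."""
    job_args = build_job_args(event)
    if job_args is None:
        return None
    # Imported lazily so the argument-building logic can be tested without the SDK.
    from google.cloud import aiplatform

    aiplatform.init(project=PROJECT_ID, location=REGION)
    job = aiplatform.PipelineJob(**job_args)
    # Submitting needs Vertex AI permissions on the function's service account.
    job.submit()
    return job_args
```

Keeping `build_job_args` separate from the SDK call means the filtering logic can be checked locally, for example with `build_job_args({"bucket": "b", "name": "data.csv"})`, before the function is deployed.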
Notebooks
- cifar-10-vertex-autoML-pipeline
  - This notebook should be run before the second notebook. It prepares the Kubeflow Pipeline with two additional custom components that determine whether a dataset already exists. The notebook as a whole produces the pipeline spec JSON file and puts it in the GCS bucket.
- prepare-cifar10-subset
  - This notebook creates two subsets of the CIFAR10 dataset for simulation purposes. It also provides the code for the Cloud Function, which you can deploy directly from within the notebook. Lastly, it simulates the continuous adaptation scenario by adding each subset of data in sequence.
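The dataset check performed by the custom components in the first notebook can be sketched roughly as follows. This is an illustration under assumptions, not the notebook's actual component code: the function names, the `"create"`/`"import"` labels, and the dataset display name are hypothetical, and the lookup helper assumes the `google-cloud-aiplatform` package and valid credentials.

```python
"""Sketch of the 'does a dataset already exist?' decision (assumed names)."""


def choose_dataset_action(existing_display_names, target_display_name):
    """Return 'import' if a dataset with the target name exists, else 'create'."""
    if target_display_name in set(existing_display_names):
        return "import"
    return "create"


def list_image_dataset_names(project, location, display_name):
    """Look up managed image datasets that match a display name.

    Needs the google-cloud-aiplatform package and valid credentials.
    """
    # Imported lazily so choose_dataset_action stays testable offline.
    from google.cloud import aiplatform

    aiplatform.init(project=project, location=location)
    datasets = aiplatform.ImageDataset.list(filter=f'display_name="{display_name}"')
    return [d.display_name for d in datasets]


if __name__ == "__main__":
    # Offline illustration with made-up dataset names.
    print(choose_dataset_action([], "cifar10-dataset"))
    print(choose_dataset_action(["cifar10-dataset"], "cifar10-dataset"))
```

In a simulation like the one described above, the first subset would presumably take the `"create"` branch and the second subset the `"import"` branch.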