[Feature] Add support for GCP Vertex AI
Summary
Add support for ingesting GCP Vertex AI resources into Cartography. Vertex AI is Google Cloud's unified machine learning platform for building, deploying, and scaling ML models. This feature would allow Cartography to track Vertex AI models, endpoints, training pipelines, notebooks, and feature stores.
Motivation
GCP Vertex AI is increasingly adopted for enterprise ML/AI workloads and represents a critical component of cloud infrastructure that needs security visibility. By ingesting Vertex AI resources, Cartography can surface:
- Model deployments and their serving endpoints
- Training pipelines and their data sources (GCS buckets, BigQuery)
- Workbench notebook instances and their service accounts
- Feature stores and their underlying data
- Model registry entries and versioning
- Prediction endpoints and their IAM policies
This unlocks graph-based security analysis such as:
- Identifying notebooks with overly permissive service accounts
- Tracking data lineage from GCS/BigQuery through training to deployed models
- Auditing which principals can invoke prediction endpoints
- Detecting publicly accessible endpoints
- Mapping ML pipelines and their GCP resource dependencies
Proposed Solution
Extend the GCP intel module to call the Vertex AI APIs and model the following resources:
New Nodes:
- GCPVertexAIModel - Trained models
- GCPVertexAIEndpoint - Model serving endpoints
- GCPVertexAIDeployedModel - Models deployed to endpoints
- GCPVertexAINotebook - Workbench notebook instances
- GCPVertexAITrainingPipeline - Training pipelines
- GCPVertexAIFeatureStore - Feature stores
- GCPVertexAIDataset - Training datasets
New Relationships:
- (:GCPProject)-[:RESOURCE]->(:GCPVertexAIModel)
- (:GCPVertexAIEndpoint)-[:SERVES]->(:GCPVertexAIDeployedModel)
- (:GCPVertexAIDeployedModel)-[:INSTANCE_OF]->(:GCPVertexAIModel)
- (:GCPVertexAINotebook)-[:USES_SERVICE_ACCOUNT]->(:GCPServiceAccount)
- (:GCPVertexAITrainingPipeline)-[:READS_FROM]->(:GCSBucket)
- (:GCPVertexAITrainingPipeline)-[:READS_FROM]->(:BigQueryDataset)
- (:GCPVertexAITrainingPipeline)-[:PRODUCES]->(:GCPVertexAIModel)
- (:GCPVertexAIModel)-[:STORED_IN]->(:GCSBucket)
- (:GCPVertexAIFeatureStore)-[:BACKED_BY]->(:BigQueryDataset)
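As a rough illustration of how the endpoint relationships might be derived, the sketch below flattens items from an `endpoints.list` response into rows for `GCPVertexAIEndpoint` and `GCPVertexAIDeployedModel`, carrying the keys needed for the `SERVES` and `INSTANCE_OF` edges. The input field names (`name`, `displayName`, `deployedModels[].id`, `deployedModels[].model`) follow the Vertex AI REST `Endpoint` resource. The function name `transform_endpoints` and the output row shape are hypothetical and are not existing Cartography code.

```python
from __future__ import annotations

from typing import Any


def transform_endpoints(endpoints: list[dict[str, Any]]) -> dict[str, list[dict[str, Any]]]:
    """Flatten Endpoint resources into endpoint rows and deployed-model rows."""
    endpoint_rows: list[dict[str, Any]] = []
    deployed_rows: list[dict[str, Any]] = []
    for ep in endpoints:
        endpoint_id = ep["name"]  # full resource name, used as the node id
        endpoint_rows.append({"id": endpoint_id, "display_name": ep.get("displayName")})
        for dm in ep.get("deployedModels", []):
            deployed_rows.append({
                # Assumption: a DeployedModel id is only unique within its endpoint,
                # so the endpoint resource name is used as a prefix.
                "id": f"{endpoint_id}/deployedModels/{dm['id']}",
                "endpoint_id": endpoint_id,   # source of the SERVES edge
                "model_id": dm.get("model"),  # target of the INSTANCE_OF edge
                "display_name": dm.get("displayName"),
            })
    return {"endpoints": endpoint_rows, "deployed_models": deployed_rows}
```

Keeping `endpoint_id` and `model_id` on each deployed-model row means both edges can be created from a single list of rows, without a second API call.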
GCP APIs to integrate:
- aiplatform.googleapis.com - Vertex AI API
  - projects.locations.models.list
  - projects.locations.endpoints.list
  - projects.locations.notebookRuntimes.list
  - projects.locations.trainingPipelines.list
  - projects.locations.featurestores.list
  - projects.locations.datasets.list
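The list methods above are regional and paginated, so a sync would likely need a small helper along these lines. This is a minimal sketch, not a proposed implementation: `list_resources` and the injected `fetch_page` callable (standing in for an authenticated HTTP GET that returns decoded JSON) are hypothetical names. It assumes the regional REST host pattern `{location}-aiplatform.googleapis.com` and that each response holds its items under a key matching the collection name (for example `models` or `endpoints`) alongside `nextPageToken`.

```python
from __future__ import annotations

from typing import Any, Callable, Dict, Iterator

# Hypothetical hook: fetch_page(url, params) performs an authenticated GET
# and returns the decoded JSON body as a dict.
FetchPage = Callable[[str, Dict[str, str]], Dict[str, Any]]


def list_resources(
    project_id: str,
    location: str,
    collection: str,
    fetch_page: FetchPage,
) -> Iterator[Dict[str, Any]]:
    """Yield every item of one Vertex AI collection in one location."""
    url = (
        f"https://{location}-aiplatform.googleapis.com/v1/"
        f"projects/{project_id}/locations/{location}/{collection}"
    )
    params: Dict[str, str] = {}
    while True:
        page = fetch_page(url, params)
        # Assumption: items are returned under a key equal to the collection name.
        yield from page.get(collection, [])
        token = page.get("nextPageToken")
        if not token:
            return
        params = {"pageToken": token}
```

Injecting `fetch_page` keeps the pagination logic testable without credentials; a real sync would call this once per collection for each location in which the project has Vertex AI resources.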
Alternatives Considered
- Using Cloud Asset Inventory for Vertex AI resources - CAI coverage for Vertex AI is limited
- Focusing only on endpoints - misses the training pipeline and data lineage aspects
Relevant Links
- Vertex AI Documentation
- Vertex AI API Reference
- Vertex AI Security Best Practices
- Related to Issue #415 - Extend GCP Support