
[Feature] Add support for GCP Vertex AI

Open kunaals opened this issue 2 weeks ago • 0 comments

Summary

Add support for ingesting GCP Vertex AI resources into Cartography. Vertex AI is Google Cloud's unified machine learning platform for building, deploying, and scaling ML models. This feature would allow Cartography to track Vertex AI models, endpoints, training pipelines, notebooks, and feature stores.

Motivation

GCP Vertex AI is increasingly adopted for enterprise ML/AI workloads and represents a critical component of cloud infrastructure that needs security visibility. By ingesting Vertex AI resources, Cartography can surface:

  • Model deployments and their serving endpoints
  • Training pipelines and their data sources (GCS buckets, BigQuery)
  • Workbench notebook instances and their service accounts
  • Feature stores and their underlying data
  • Model registry entries and versioning
  • Prediction endpoints and their IAM policies

This unlocks graph-based security analysis such as:

  • Identifying notebooks with overly permissive service accounts
  • Tracking data lineage from GCS/BigQuery through training to deployed models
  • Auditing which principals can invoke prediction endpoints
  • Detecting publicly accessible endpoints
  • Mapping ML pipelines and their GCP resource dependencies

Proposed Solution

Extend the GCP intel module to call the Vertex AI APIs and model the following resources:

New Nodes:

  • GCPVertexAIModel - Trained models
  • GCPVertexAIEndpoint - Model serving endpoints
  • GCPVertexAIDeployedModel - Models deployed to endpoints
  • GCPVertexAINotebook - Workbench notebook instances
  • GCPVertexAITrainingPipeline - Training pipelines
  • GCPVertexAIFeatureStore - Feature stores
  • GCPVertexAIDataset - Training datasets

New Relationships:

  • (:GCPProject)-[:RESOURCE]->(:GCPVertexAIModel)
  • (:GCPVertexAIEndpoint)-[:SERVES]->(:GCPVertexAIDeployedModel)
  • (:GCPVertexAIDeployedModel)-[:INSTANCE_OF]->(:GCPVertexAIModel)
  • (:GCPVertexAINotebook)-[:USES_SERVICE_ACCOUNT]->(:GCPServiceAccount)
  • (:GCPVertexAITrainingPipeline)-[:READS_FROM]->(:GCSBucket)
  • (:GCPVertexAITrainingPipeline)-[:READS_FROM]->(:BigQueryDataset)
  • (:GCPVertexAITrainingPipeline)-[:PRODUCES]->(:GCPVertexAIModel)
  • (:GCPVertexAIModel)-[:STORED_IN]->(:GCSBucket)
  • (:GCPVertexAIFeatureStore)-[:BACKED_BY]->(:BigQueryDataset)
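
To make the proposed graph shape concrete, here is a minimal sketch that writes the relationships above as plain data and renders them as Cypher patterns. It is illustration only: none of these labels exist in Cartography yet, the `Rel` helper and both functions are hypothetical, and the `id` property in the query is an assumption about how the new nodes would be keyed.

```python
from typing import List, NamedTuple


class Rel(NamedTuple):
    """One proposed edge: (source label)-[:rel_type]->(target label)."""
    source: str
    rel_type: str
    target: str


# Copied from the proposal above; these are proposed, not existing, schema.
PROPOSED_RELS: List[Rel] = [
    Rel("GCPProject", "RESOURCE", "GCPVertexAIModel"),
    Rel("GCPVertexAIEndpoint", "SERVES", "GCPVertexAIDeployedModel"),
    Rel("GCPVertexAIDeployedModel", "INSTANCE_OF", "GCPVertexAIModel"),
    Rel("GCPVertexAINotebook", "USES_SERVICE_ACCOUNT", "GCPServiceAccount"),
    Rel("GCPVertexAITrainingPipeline", "READS_FROM", "GCSBucket"),
    Rel("GCPVertexAITrainingPipeline", "READS_FROM", "BigQueryDataset"),
    Rel("GCPVertexAITrainingPipeline", "PRODUCES", "GCPVertexAIModel"),
    Rel("GCPVertexAIModel", "STORED_IN", "GCSBucket"),
    Rel("GCPVertexAIFeatureStore", "BACKED_BY", "BigQueryDataset"),
]


def match_pattern(rel: Rel, src_var: str = "a", dst_var: str = "b") -> str:
    """Render one proposed relationship as a Cypher path pattern."""
    return f"({src_var}:{rel.source})-[:{rel.rel_type}]->({dst_var}:{rel.target})"


def lineage_query() -> str:
    """Cypher tracing a GCS bucket through training to a serving endpoint.

    The `id` property is an assumption, not part of the proposal.
    """
    return (
        "MATCH (b:GCSBucket)<-[:READS_FROM]-(p:GCPVertexAITrainingPipeline)"
        "-[:PRODUCES]->(m:GCPVertexAIModel)"
        "<-[:INSTANCE_OF]-(d:GCPVertexAIDeployedModel)"
        "<-[:SERVES]-(e:GCPVertexAIEndpoint) "
        "RETURN b.id, p.id, m.id, e.id"
    )
```

A query like the one returned by `lineage_query()` is the kind of data-lineage question the Motivation section describes; the other analyses would follow the same pattern over the remaining edges.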

GCP APIs to integrate:

  • aiplatform.googleapis.com - Vertex AI API
    • projects.locations.models.list
    • projects.locations.endpoints.list
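    • projects.locations.notebookRuntimes.list (Colab Enterprise runtimes; Workbench instances are listed through notebooks.googleapis.com, projects.locations.instances.list)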
    • projects.locations.trainingPipelines.list
    • projects.locations.featurestores.list
    • projects.locations.datasets.list
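
The list methods above are called per location and return results in pages, with a `nextPageToken` field when more results remain. The sketch below shows one way the ingestion could drain those pages and flatten model records into node dictionaries. It is a sketch under assumptions, not Cartography's implementation: `list_all`, `transform_models`, and the injected `fetch_page` callable are hypothetical names, and the output keys (`id`, `display_name`, `location`, `project_id`) are illustrative rather than a settled schema.

```python
from typing import Any, Callable, Dict, Iterator, List, Optional

Page = Dict[str, Any]
# Hypothetical hook: given a page token (or None), return one API response page.
FetchPage = Callable[[Optional[str]], Page]


def list_all(fetch_page: FetchPage, items_key: str) -> Iterator[Dict[str, Any]]:
    """Yield every item from a paginated list call by following nextPageToken."""
    token: Optional[str] = None
    while True:
        page = fetch_page(token)
        yield from page.get(items_key, [])
        token = page.get("nextPageToken")
        if not token:
            break


def transform_models(raw_models: List[Dict[str, Any]], project_id: str) -> List[Dict[str, Any]]:
    """Flatten raw model records into dictionaries ready to load as nodes."""
    nodes = []
    for model in raw_models:
        # Resource names look like projects/{project}/locations/{location}/models/{model}
        name = model["name"]
        location = name.split("/")[3]
        nodes.append({
            "id": name,  # assumption: key the node on the full resource name
            "display_name": model.get("displayName"),
            "location": location,
            "project_id": project_id,
        })
    return nodes
```

Injecting `fetch_page` keeps the pagination logic independent of whichever client library ends up making the HTTP call, and lets it be exercised with canned pages in unit tests.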

Alternatives Considered

  • Using Cloud Asset Inventory for Vertex AI resources - CAI coverage for Vertex AI is limited
  • Focusing only on endpoints - misses the training pipeline and data lineage aspects

Relevant Links

kunaals • Dec 08 '25 19:12