ai-on-gke

Add requirements for GKE customer on ray-on-gke README

Open brandonroyal opened this issue 1 year ago • 0 comments

TL;DR - the ray-on-gke README should be updated with the requirements for setting up the prerequisite GKE clusters.

  • GKE Cluster must have Workload Identity enabled
  • GKE Cluster must have KubeRay Operator deployed
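
A pre-flight check for these two prerequisites might look like the sketch below. It is not part of the repo. It assumes you pass in the JSON from `gcloud container clusters describe <cluster> --format=json` and the lines from `kubectl get crd -o name`. The function name `missing_prerequisites` is hypothetical, and the presence of the `rayclusters.ray.io` CRD is used only as a proxy for the KubeRay operator being deployed. It checks Workload Identity at the cluster level only, not per node pool.

```python
import json

# CRD that the KubeRay operator chart installs; used here as a proxy
# for "operator is deployed" (an assumption, not a guarantee).
RAY_CLUSTER_CRD = "rayclusters.ray.io"


def missing_prerequisites(cluster_json: str, crd_names: list[str]) -> list[str]:
    """Return a human-readable list of unmet prerequisites.

    cluster_json: output of `gcloud container clusters describe ... --format=json`
    crd_names:    lines from `kubectl get crd -o name`
    """
    cluster = json.loads(cluster_json)
    missing = []

    # Workload Identity is enabled when the cluster has a workload pool set.
    pool = (cluster.get("workloadIdentityConfig") or {}).get("workloadPool")
    if not pool:
        missing.append("Workload Identity is not enabled")

    # `kubectl get crd -o name` prefixes names with the resource type,
    # so keep only the part after the last "/".
    names = {name.strip().rsplit("/", 1)[-1] for name in crd_names}
    if RAY_CLUSTER_CRD not in names:
        missing.append("KubeRay operator is not deployed (RayCluster CRD not found)")

    return missing
```

Running this before `terraform apply` would surface both problems at once instead of one failure at a time.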

The errors below occur when these prerequisites are NOT met. They were observed on a GKE Standard cluster (1.27.3-gke.100).

module.service_accounts.google_project_iam_binding.monitoring-viewer: Creation complete after 7s [id=ie-raycluster-0f2aa542/roles/monitoring.viewer]
╷
│ Error: unable to build kubernetes objects from release manifest: resource mapping not found for name: "example-cluster-kuberay" namespace: "" from "": no matches for kind "RayCluster" in version "ray.io/v1alpha1"
│ ensure CRDs are installed first
│ 
│   with module.kuberay.helm_release.ray-cluster,
│   on modules/kuberay/kuberay.tf line 15, in resource "helm_release" "ray-cluster":
│   15: resource "helm_release" "ray-cluster" {
│ 
╵

This error occurs because the KubeRay operator, which provides the RayCluster CRD, is not installed on the cluster.
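
To make this failure easier to spot in a long Terraform log, a small helper could pull out the missing kind and API version. This is a sketch only: the function name `find_missing_kinds` is hypothetical, and the pattern is based solely on the wording of the error shown above.

```python
import re

# Matches the Kubernetes "no matches for kind" message as it appears
# in the Terraform/Helm error above.
_MISSING_KIND = re.compile(r'no matches for kind "([^"]+)" in version "([^"]+)"')


def find_missing_kinds(terraform_output: str) -> list[tuple[str, str]]:
    """Return (kind, api_version) pairs for every missing-CRD error found."""
    return _MISSING_KIND.findall(terraform_output)
```

A non-empty result that includes `RayCluster` suggests the KubeRay operator (and its CRDs) should be installed before re-running the apply.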

The iam-role-binding will also fail because Workload Identity is not enabled.

brandonroyal, Sep 28 '23 14:09