[BUG]: kaniko pod `serviceAccountName` specified in wrong scope

Open cameronraysmith opened this issue 1 year ago • 0 comments

The `serviceAccountName` configured on the kaniko image-builder component is read here

https://github.com/zenml-io/zenml/blob/8217105bf2ed15d935f1fd134a72875cd873a994/src/zenml/integrations/kaniko/image_builders/kaniko_image_builder.py#L204

and then placed inside the `containers` scope

https://github.com/zenml-io/zenml/blob/8217105bf2ed15d935f1fd134a72875cd873a994/src/zenml/integrations/kaniko/image_builders/kaniko_image_builder.py#L221

but `serviceAccountName` is a Pod-level field, so it likely needs to be declared directly under the Pod `spec` scope

https://github.com/zenml-io/zenml/blob/8217105bf2ed15d935f1fd134a72875cd873a994/src/zenml/integrations/kaniko/image_builders/kaniko_image_builder.py#L210
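To make the proposed scope change concrete, here is a minimal sketch of what the overrides could look like with `serviceAccountName` moved up to the Pod `spec`. This is not the actual ZenML code: the function name `build_spec_overrides`, its parameters, and the example pod name, image tag, and args are all hypothetical and only illustrate where the field would sit.

```python
from typing import Any, Dict, List, Optional


def build_spec_overrides(
    pod_name: str,
    executor_image: str,
    args: List[str],
    service_account_name: Optional[str] = None,
) -> Dict[str, Any]:
    """Hypothetical helper: build Pod overrides for a kaniko build pod."""
    spec: Dict[str, Any] = {
        "containers": [
            {
                # Container-level fields only; no serviceAccountName here.
                "name": pod_name,
                "image": executor_image,
                "args": args,
            }
        ],
    }
    if service_account_name:
        # serviceAccountName belongs to the Pod spec, not to a container.
        spec["serviceAccountName"] = service_account_name
    return {"apiVersion": "v1", "spec": spec}


# Placeholder values for illustration only.
overrides = build_spec_overrides(
    pod_name="kaniko-build-example",
    executor_image="gcr.io/kaniko-project/executor:latest",
    args=["--dockerfile=Dockerfile"],
    service_account_name="kaniko",
)
```

With the field at this level, the resulting pod should report the registered service account instead of `default`.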

Without this, all kaniko pods are assigned the default service account,

  serviceAccount: default
  serviceAccountName: default  

even when a different service account has been registered with the kaniko image-builder component

❯ zenml image-builder describe kaniko  
Image_Builder 'kaniko' of flavor 'kaniko' with id 'e66facf7-4e41-4aa4-8ab7-dbac0005bdff' is owned by user 'default' and is 'private'.
        'kaniko' IMAGE_BUILDER Component Configuration (ACTIVE)        
┏━━━━━━━━━━━━━━━━━━━━━━┯━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┓
┃ COMPONENT_PROPERTY   │ VALUE                                        ┃
┠──────────────────────┼──────────────────────────────────────────────┨
┃ KUBERNETES_CONTEXT   │ gke_project_region_cluster                   ┃
┠──────────────────────┼──────────────────────────────────────────────┨
┃ KUBERNETES_NAMESPACE │ kaniko                                       ┃
┠──────────────────────┼──────────────────────────────────────────────┨
┃ SERVICE_ACCOUNT_NAME │ kaniko                                       ┃
┗━━━━━━━━━━━━━━━━━━━━━━┷━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┛

If the `default` service account in the relevant namespace is not bound through workload identity to an account that has the `artifactregistry.repositories.uploadArtifacts` permission, then the pod fails with

INFO[0000] To simulate EOF and exit, press 'Ctrl+D'     
error checking push permissions -- make sure you entered the correct tag name, and that you are authenticated correctly, and try again: checking push permission for "us.gcr.io/project/pipelines/zenml:model_training_pipeline-orchestrator": creating push check transport for us.gcr.io failed: GET https://us.gcr.io/v2/token?scope=repository%3Aproject%2Fpipelines%2Fzenml%3Apush%2Cpull&service=us.gcr.io: DENIED: Permission "artifactregistry.repositories.uploadArtifacts" denied on resource "projects/project/locations/us/repositories/us.gcr.io" (or it may not exist)
Stream closed EOF for kaniko/kaniko-build-822afce0 (kaniko-build-822afce0)

A workaround is to apply to the `default` service account, in the namespace where kaniko is running, the same RoleBinding and workload identity annotation that are applied to the registered service account.
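As a rough sketch of the annotation half of that workaround, the snippet below builds a patch body that adds the GKE workload identity annotation to a Kubernetes service account. The helper name `workload_identity_patch` and the Google service account email are hypothetical placeholders; the RoleBinding part is not covered here.

```python
from typing import Any, Dict

# Annotation key used by GKE Workload Identity on Kubernetes service accounts.
WORKLOAD_IDENTITY_ANNOTATION = "iam.gke.io/gcp-service-account"


def workload_identity_patch(gcp_service_account_email: str) -> Dict[str, Any]:
    """Hypothetical helper: patch body annotating a Kubernetes service account."""
    return {
        "metadata": {
            "annotations": {
                WORKLOAD_IDENTITY_ANNOTATION: gcp_service_account_email,
            }
        }
    }


# Placeholder email; substitute the account bound to the registered service account.
patch = workload_identity_patch("kaniko@project.iam.gserviceaccount.com")
```

Assuming the official `kubernetes` Python client is installed and configured for the cluster, a body like this could be passed to `CoreV1Api.patch_namespaced_service_account(name="default", namespace="kaniko", body=patch)`.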

cameronraysmith, Oct 21 '23 06:10