argo-workflows
Image pull error: User "system:serviceaccount:argo:argo-helm-argo-workflows-workflow-controller" cannot get resource "secrets" in API group "" in the namespace "mynamespace"
Pre-requisites
- [X] I have double-checked my configuration
- [X] I can confirm the issue exists when I tested with :latest
- [ ] I'd like to contribute the fix myself (see contributing guide)
What happened/what you expected to happen?
We store our application's container images in a private image registry and deploy Argo using Helm. It seems that in v3.4 the workflow controller tries to read the container image manifest (to look up the cmd/args) using the "argo-helm-argo-workflows-workflow-controller" service account from the argo namespace. Reading the manifest requires registry access credentials in the case of a private image registry, and we provide the secret with the credentials in our deployments: imagePullSecrets: - name: registry-credentials
When we submit a workflow the workflow controller's service account fails to read the registry access credentials from the secret located in the namespace of the application:
Image pull error: User "system:serviceaccount:argo:argo-helm-argo-workflows-workflow-controller" cannot get resource "secrets" in API group "" in the namespace "mynamespace"
Earlier, we tested one of the latest 3.3.9 builds, and it could pull and read the image successfully; see issue https://github.com/argoproj/argo-workflows/issues/9139
We use the argo service account in the application's namespace to submit workflows (--serviceaccount option); it can read the secret in the same namespace. Would it be possible to use this service account to pull the image manifest? Otherwise, must the user "system:serviceaccount:argo:argo-helm-argo-workflows-workflow-controller" be able to read secrets in all namespaces where an application is deployed?
Please explain how to use images from a private registry with access credentials in v3.4.0.
Version
3.4.0
Paste a small workflow that reproduces the issue. We must be able to run the workflow; don't enter a workflow that uses private images.
[The issue seems to be specific to accessing credentials for private registries from the secret in the application's namespace.]
Logs from the workflow controller
time="2022-09-20T12:46:03.478Z" level=info msg="Processing workflow" namespace=san-app-test workflow=app-adhoc-ac-db-version-1663677914
time="2022-09-20T12:46:03.649Z" level=info msg="Updated phase -> Running" namespace=san-app-test workflow=app-adhoc-ac-db-version-1663677914
time="2022-09-20T12:46:03.649Z" level=info msg="DAG node app-adhoc-ac-db-version-1663677914 initialized Running" namespace=san-app-test workflow=app-adhoc-ac-db-version-1663677914
time="2022-09-20T12:46:03.649Z" level=info msg="All of node app-adhoc-ac-db-version-1663677914.db-version dependencies [] completed" namespace=san-app-test workflow=app-adhoc-ac-db-version-1663677914
time="2022-09-20T12:46:03.656Z" level=info msg="DAG node app-adhoc-ac-db-version-1663677914-749901051 initialized Running" namespace=san-app-test workflow=app-adhoc-ac-db-version-1663677914
time="2022-09-20T12:46:03.656Z" level=info msg="All of node app-adhoc-ac-db-version-1663677914.db-version.db-version-task dependencies [] completed" namespace=san-app-test workflow=app-adhoc-ac-db-version-1663677914
time="2022-09-20T12:46:03.666Z" level=info msg="Pod node app-adhoc-ac-db-version-1663677914-3159602526 initialized Pending" namespace=san-app-test workflow=app-adhoc-ac-db-version-1663677914
time="2022-09-20T12:46:03.674Z" level=error msg="Mark error node" error="failed to look-up entrypoint/cmd for image "myregistry.cloud/releases/myapp:myimage", you must either explicitly specify the command, or list the image's command in the index: https://argoproj.github.io/argo-workflows/workflow-executors/#emissary-emissary: secrets "app-registry-creds" is forbidden: User "system:serviceaccount:argo:argo-helm-argo-workflows-workflow-controller" cannot get resource "secrets" in API group "" in the namespace "san-app-test"" namespace=san-app-test nodeName=app-adhoc-ac-db-version-1663677914.db-version.db-version-task workflow=app-adhoc-ac-db-version-1663677914
time="2022-09-20T12:46:03.674Z" level=info msg="node app-adhoc-ac-db-version-1663677914-3159602526 phase Pending -> Error" namespace=san-app-test workflow=app-adhoc-ac-db-version-1663677914
time="2022-09-20T12:46:03.674Z" level=info msg="node app-adhoc-ac-db-version-1663677914-3159602526 message: failed to look-up entrypoint/cmd for image "myregistry.cloud/releases/myapp:myimage", you must either explicitly specify the command, or list the image's command in the index: https://argoproj.github.io/argo-workflows/workflow-executors/#emissary-emissary: secrets "app-registry-creds" is forbidden: User "system:serviceaccount:argo:argo-helm-argo-workflows-workflow-controller" cannot get resource "secrets" in API group "" in the namespace "san-app-test"" namespace=san-app-test workflow=app-adhoc-ac-db-version-1663677914
time="2022-09-20T12:46:03.674Z" level=info msg="node app-adhoc-ac-db-version-1663677914-3159602526 finished: 2022-09-20 12:46:03.674633014 +0000 UTC" namespace=san-app-test workflow=app-adhoc-ac-db-version-1663677914
time="2022-09-20T12:46:03.674Z" level=error msg="Mark error node" error="task 'app-adhoc-ac-db-version-1663677914.db-version.db-version-task' errored: failed to look-up entrypoint/cmd for image "myregistry.cloud/releases/myapp:myimage", you must either explicitly specify the command, or list the image's command in the index: https://argoproj.github.io/argo-workflows/workflow-executors/#emissary-emissary: secrets "app-registry-creds" is forbidden: User "system:serviceaccount:argo:argo-helm-argo-workflows-workflow-controller" cannot get resource "secrets" in API group "" in the namespace "san-app-test"" namespace=san-app-test nodeName=app-adhoc-ac-db-version-1663677914.db-version.db-version-task workflow=app-adhoc-ac-db-version-1663677914
time="2022-09-20T12:46:03.674Z" level=info msg="node app-adhoc-ac-db-version-1663677914-3159602526 message: task 'app-adhoc-ac-db-version-1663677914.db-version.db-version-task' errored: failed to look-up entrypoint/cmd for image "myregistry.cloud/releases/myapp:myimage", you must either explicitly specify the command, or list the image's command in the index: https://argoproj.github.io/argo-workflows/workflow-executors/#emissary-emissary: secrets "app-registry-creds" is forbidden: User "system:serviceaccount:argo:argo-helm-argo-workflows-workflow-controller" cannot get resource "secrets" in API group "" in the namespace "san-app-test"" namespace=san-app-test workflow=app-adhoc-ac-db-version-1663677914
time="2022-09-20T12:46:03.686Z" level=info msg="Outbound nodes of app-adhoc-ac-db-version-1663677914-749901051 set to [app-adhoc-ac-db-version-1663677914-3159602526]" namespace=san-app-test workflow=app-adhoc-ac-db-version-1663677914
time="2022-09-20T12:46:03.686Z" level=info msg="node app-adhoc-ac-db-version-1663677914-749901051 phase Running -> Error" namespace=san-app-test workflow=app-adhoc-ac-db-version-1663677914
time="2022-09-20T12:46:03.686Z" level=info msg="node app-adhoc-ac-db-version-1663677914-749901051 finished: 2022-09-20 12:46:03.686553147 +0000 UTC" namespace=san-app-test workflow=app-adhoc-ac-db-version-1663677914
time="2022-09-20T12:46:03.686Z" level=info msg="Checking daemoned children of app-adhoc-ac-db-version-1663677914-749901051" namespace=san-app-test workflow=app-adhoc-ac-db-version-1663677914
time="2022-09-20T12:46:03.691Z" level=info msg="Outbound nodes of app-adhoc-ac-db-version-1663677914 set to [app-adhoc-ac-db-version-1663677914-3159602526]" namespace=san-app-test workflow=app-adhoc-ac-db-version-1663677914
time="2022-09-20T12:46:03.691Z" level=info msg="node app-adhoc-ac-db-version-1663677914 phase Running -> Error" namespace=san-app-test workflow=app-adhoc-ac-db-version-1663677914
time="2022-09-20T12:46:03.691Z" level=info msg="node app-adhoc-ac-db-version-1663677914 finished: 2022-09-20 12:46:03.69151054 +0000 UTC" namespace=san-app-test workflow=app-adhoc-ac-db-version-1663677914
time="2022-09-20T12:46:03.691Z" level=info msg="Checking daemoned children of app-adhoc-ac-db-version-1663677914" namespace=san-app-test workflow=app-adhoc-ac-db-version-1663677914
time="2022-09-20T12:46:03.691Z" level=info msg="TaskSet Reconciliation" namespace=san-app-test workflow=app-adhoc-ac-db-version-1663677914
time="2022-09-20T12:46:03.691Z" level=info msg=reconcileAgentPod namespace=san-app-test workflow=app-adhoc-ac-db-version-1663677914
time="2022-09-20T12:46:03.691Z" level=info msg="Updated phase Running -> Error" namespace=san-app-test workflow=app-adhoc-ac-db-version-1663677914
time="2022-09-20T12:46:03.691Z" level=info msg="Marking workflow completed" namespace=san-app-test workflow=app-adhoc-ac-db-version-1663677914
time="2022-09-20T12:46:03.691Z" level=info msg="Marking workflow as pending archiving" namespace=san-app-test workflow=app-adhoc-ac-db-version-1663677914
time="2022-09-20T12:46:03.691Z" level=info msg="Checking daemoned children of " namespace=san-app-test workflow=app-adhoc-ac-db-version-1663677914
time="2022-09-20T12:46:03.696Z" level=info msg="cleaning up pod" action=deletePod key=san-app-test/app-adhoc-ac-db-version-1663677914-1340600742-agent/deletePod
time="2022-09-20T12:46:03.704Z" level=info msg="Workflow update successful" namespace=san-app-test phase=Error resourceVersion=100074719 workflow=app-adhoc-ac-db-version-1663677914
time="2022-09-20T12:46:03.707Z" level=info msg="archiving workflow" namespace=san-app-test uid=e25bc895-59ec-46ae-8e39-4dea893eb0f7 workflow=app-adhoc-ac-db-version-1663677914
time="2022-09-20T12:46:03.727Z" level=info msg="Queueing Error workflow san-app-test/app-adhoc-ac-db-version-1663677914 for delete in 5m0s due to TTL"
time="2022-09-20T12:51:04.000Z" level=info msg="Deleting garbage collected workflow 'san-app-test/app-adhoc-ac-db-version-1663677914'"
time="2022-09-20T12:51:04.014Z" level=info msg="Successfully deleted 'san-app-test/app-adhoc-ac-db-version-1663677914'"
Logs from your workflow's wait container
[no output, as the workflow could not be submitted due to manifest pull error]
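The controller error in the logs says the look-up can be avoided by specifying the command explicitly. A minimal sketch of what that might look like follows. It is not a confirmed fix for this issue, and it assumes that the controller skips the manifest look-up (and therefore the secret read) whenever `command` is set. The `generateName`, template name, command path, and args are hypothetical placeholders; the namespace, image, service account, and secret name are the ones quoted in this report.

```yaml
apiVersion: argoproj.io/v1alpha1
kind: Workflow
metadata:
  generateName: private-image-      # hypothetical name
  namespace: mynamespace
spec:
  entrypoint: main
  serviceAccountName: argo          # the service account passed via --serviceaccount
  imagePullSecrets:
    - name: registry-credentials    # still needed by the kubelet to pull the image
  templates:
    - name: main                    # hypothetical template name
      container:
        image: myregistry.cloud/releases/myapp:myimage
        # Setting the command explicitly is what the error message asks for;
        # the path and args below are placeholders, not the real entrypoint.
        command: ["/app/entrypoint"]
        args: ["db-version"]
```

With this approach the image's real entrypoint has to be known and kept in sync with the manifest by hand, which is the trade-off against letting the controller read it from the registry.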
@vitalyrychkov can you provide your k8s version?
There is a PR for supporting the v1.24 service account secret change: #9620
@sarabala1979 K8s cluster version: 1.23.2 kubectl client version: 1.23
@terrytangyuan will work on this.
Hi @sarabala1979 and @terrytangyuan
We have tried to use a private image registry with anonymous pull enabled.
We use the same image to start a pod (service) and to submit a task in Argo. The service account of the workflow-controller was given RBAC permissions to read the secret defined in the "imagePullSecrets" parameter of our deployments.
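For reference, a grant like the one described above could be limited to the single pull secret instead of all secrets in the namespace. This is only a sketch under assumptions: the Role and RoleBinding names are hypothetical, while the namespace, secret name, and controller service account are taken from the controller logs earlier in this issue.

```yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: registry-secret-reader            # hypothetical name
  namespace: san-app-test
rules:
  - apiGroups: [""]
    resources: ["secrets"]
    resourceNames: ["app-registry-creds"] # restrict "get" to this one secret
    verbs: ["get"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: workflow-controller-registry-secret-reader   # hypothetical name
  namespace: san-app-test
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: Role
  name: registry-secret-reader
subjects:
  - kind: ServiceAccount
    name: argo-helm-argo-workflows-workflow-controller
    namespace: argo
```

A Role and RoleBinding like this would have to be created in every application namespace that uses a private registry.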
We have tested the following scenarios:
- Password-protected access only. The imagePullSecret exists in our namespace. Our pod starts fine using the registry credentials from the secret. The submitted task starts fine using the registry credentials from the secret.
- Anonymous access enabled. The imagePullSecret does not exist. Our pod starts fine without using registry credentials. The submitted task fails to look up the entrypoint/cmd with the error message "secrets not found".
- Anonymous access enabled. The imagePullSecret exists in our namespace. Our pod starts fine. The submitted task starts fine using the registry credentials from the secret.
It seems that if the imagePullSecret is specified in the deployment, the workflow-controller always tries to authenticate instead of pulling anonymously. Would it be possible to try the anonymous pull first and then the password-protected pull, or to add a parameter to switch between them? Shall we discuss this issue here or open a separate issue?
Shall we discuss this issue here or open a separate issue?
Created a separate issue for this: https://github.com/argoproj/argo-workflows/issues/9802
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. If this is a mentoring request, please provide an update here. Thank you for your contributions.
Bumping this issue, not stale. Also: https://drewdevault.com/2021/10/26/stalebot.html
I feel like I have a similar situation. I am running on a kind cluster and using Tiltdev to build containers into a private, insecure Docker registry (set up with ctlptl). There is no issue pulling from public registries, but Argo Workflows cannot seem to handle the local registry, while a k8s Job submitted directly works without issue.
https://docs.docker.com/registry/deploying/
Scenario available here: https://github.com/kcirtapfromspace/got99prblms
Just adding my case for reference:
Context:
- Cloud: Google Cloud
- Argo Workflows 3.4.9 deployed on GKE 1.27 in Project A
- Private Docker Registry (asia.gcr.io) in Project B
I configured the imagePullSecrets for the default service account in the default namespace. (Private images are pulled successfully in the default namespace.)
I submitted my workflow to this namespace and got this error:
User "system:serviceaccount:argo:argo" cannot get resource "secrets" in API group "" in the namespace "default"
I added a new Role and granted the argo service account permission to read secrets:
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: secret-reader
  namespace: default
rules:
- apiGroups: [""]
  resources: ["secrets"]
  verbs: ["get"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: argo-secret-reader
  namespace: default
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: Role
  name: secret-reader
subjects:
- kind: ServiceAccount
  name: argo
  namespace: argo
The above error was gone, but I got a new one:
..failed to look-up entrypoint/cmd for image "asia.gcr.io/....", you must either explicitly specify the command, or list the image's command in the index: https://argoproj.github.io/argo-workflows/workflow-executors/#emissary-emissary, DENIED: Permission denied for...
I tried to figure out what was wrong for almost 4 days and even looked at the source code. Finally, I found this issue: https://github.com/crossplane/crossplane/issues/3023#issuecomment-1128585699
For anyone who has the same setup as me: you must grant the Default Compute Engine service account of the project that runs Argo on GKE (Project A) permission to access the Container Registry/Artifact Registry in the project that hosts the container registry (Project B).
To the team: please update to a newer version of go-containerregistry that respects the chain order (k8s first).
Thanks.