Daemon containers displayed as Failed
Pre-requisites
- [x] I have double-checked my configuration
- [x] I have tested with the `:latest` image tag (i.e. `quay.io/argoproj/workflow-controller:latest`) and can confirm the issue still exists on `:latest`. If not, I have explained why, in detail, in my description below.
- [x] I have searched existing issues and could not find a match for this bug
- [x] I'd like to contribute the fix myself (see contributing guide)
What happened? What did you expect to happen?
When using daemon containers in a fully succeeded Workflow, the daemon containers are displayed as Failed in the UI even though they exit with code 0:
Daemon container definition:
container:
  readinessProbe:
    exec:
      command:
        - redis-cli
        - ping
    initialDelaySeconds: 2
    periodSeconds: 5
    timeoutSeconds: 5
  name: redis
  image: eu.gcr.io/myrepo/base/redis:6.0.16
  imagePullPolicy: Always
  ports:
    - containerPort: 6379
Pod status in the UI:
Redis daemon container logs:
29:signal-handler (1756156059) Received SIGTERM scheduling shutdown...
29:M 25 Aug 2025 21:07:39.423 # User requested shutdown...
29:M 25 Aug 2025 21:07:39.423 * Saving the final RDB snapshot before exiting.
29:M 25 Aug 2025 21:07:39.459 * DB saved on disk
29:M 25 Aug 2025 21:07:39.459 # Redis is now ready to exit, bye bye...
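For reference, the clean SIGTERM shutdown above should translate into a terminal exit code of 0 on the pod. One way to confirm that against the API (pod name and namespace are placeholders; in Argo-created pods the main container is normally named main):
# Show the terminal state of the daemon pod's main container
kubectl get pod <daemon-pod-name> -n <namespace> \
  -o jsonpath='{.status.containerStatuses[?(@.name=="main")].state.terminated}'
# A clean shutdown should report an exitCode of 0 with reason "Completed"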
Version(s)
v3.7.1
Paste a minimal workflow that reproduces the issue. We must be able to run the workflow; don't enter a workflow that uses private images.
ClusterWorkflowTemplate for the Redis daemon:
apiVersion: argoproj.io/v1alpha1
kind: ClusterWorkflowTemplate
metadata:
  name: daemon-redis
spec:
  entrypoint: daemon-redis
  templates:
    - name: daemon-redis
      daemon: true
      terminationGracePeriodSeconds: 5
      container:
        readinessProbe:
          exec:
            command:
              - redis-cli
              - ping
          initialDelaySeconds: 2
          periodSeconds: 5
          timeoutSeconds: 5
        name: redis
        image: redis:6.0.16
        imagePullPolicy: Always
        ports:
          - containerPort: 6379
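The cluster-scoped template has to be registered before any workflow can reference it; for example, assuming the manifest above is saved as daemon-redis.yaml:
# Register the ClusterWorkflowTemplate so templateRef with clusterScope: true resolves
argo cluster-template create daemon-redis.yaml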
Workflow that uses it:
---
apiVersion: argoproj.io/v1alpha1
kind: WorkflowTemplate
metadata:
  name: test-workflow
  namespace: argo-workflows
spec:
  entrypoint: main
  serviceAccountName: argo-workflows-server
  templates:
    - name: main
      dag:
        tasks:
          - name: redis
            templateRef:
              name: daemon-redis
              template: daemon-redis
              clusterScope: true
          - name: mongodb
            templateRef:
              name: daemon-mongodb
              template: daemon-mongodb
              clusterScope: true
          - name: tests
            depends: redis && mongodb
            arguments:
              parameters:
                - name: composer_command
                  value: test
            templateRef:
              name: composer
              template: composer
              clusterScope: true
Logs from the workflow controller
❯ k logs argo-workflows-exploitation-workflow-controller-6795f478f-fcmcq | grep hhmw5
time="2025-08-26T07:37:02.220Z" level=info msg="Queueing Succeeded workflow argo-workflows/jenkins-test-phpfpm-pr-10-hhmw5 for delete in 4309h30m47s due to TTL"
Logs from your workflow's wait container
No failed containers in my workflow.
@mbouillaud please post a reproducible example; the current one uses templates and private images
I've updated my initial post @eduardodbr ;) Sorry
To add more detail: the daemon steps stay green until the finish hook succeeds.
Once the finish hook has succeeded, the daemon steps are marked Failed.
Recording of the pod states during the Workflow: https://share.cleanshot.com/L8hlt1ws
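A plain-text equivalent of that recording can be captured by watching the node phases on the Workflow object itself (the workflow name is a placeholder; this assumes kubectl's jsonpath wildcard works over the nodes map):
# Print each node's display name and phase on every update to the Workflow object
kubectl get wf <workflow-name> -n argo-workflows --watch \
  -o jsonpath='{range .status.nodes.*}{.displayName}{"\t"}{.phase}{"\n"}{end}'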
We've got the same issue after upgrading to version 0.45.25 of the Argo Workflows Helm chart, which uses image version v3.7.2. The daemon pods are marked as Failed even though the other pods succeed.
Also, I should mention that our Workflows use templateRef for those daemon tasks in the workflow DAG.
Is there any confirmation that this bug will be fixed in the next patch release of 3.7.x?
Plan to take a look over the weekend.
@mbouillaud @Charliefrodriguez
Could you help us generate a reproducible example? I tried with the following, but couldn't reproduce the issue:
# This example demonstrates daemoned steps when used in DAG templates. It is equivalent to the
# daemon-step.yaml example, but written in DAG format. The IP address of the daemoned step can be
# referenced using the '{{tasks.taskname.ip}}' variable.
apiVersion: argoproj.io/v1alpha1
kind: Workflow
metadata:
  generateName: dag-daemon-task-
spec:
  hooks:
    finish:
      template: http
      expression: workflow.status == "Succeeded"
    running:
      expression: workflow.status == "Running"
      template: http
  entrypoint: daemon-example
  templates:
    - name: daemon-example
      dag:
        tasks:
          - name: influx
            template: influxdb
          - name: init-database
            template: influxdb-client
            depends: "influx"
            arguments:
              parameters:
                - name: cmd
                  value: curl -XPOST 'http://{{tasks.influx.ip}}:8086/query' --data-urlencode "q=CREATE DATABASE mydb"
          - name: producer-1
            template: influxdb-client
            depends: "init-database"
            arguments:
              parameters:
                - name: cmd
                  value: for i in $(seq 1 20); do curl -XPOST 'http://{{tasks.influx.ip}}:8086/write?db=mydb' -d "cpu,host=server01,region=uswest load=$i" ; sleep .5 ; done
          - name: producer-2
            template: influxdb-client
            depends: "init-database"
            arguments:
              parameters:
                - name: cmd
                  value: for i in $(seq 1 20); do curl -XPOST 'http://{{tasks.influx.ip}}:8086/write?db=mydb' -d "cpu,host=server02,region=uswest load=$((RANDOM % 100))" ; sleep .5 ; done
          - name: producer-3
            template: influxdb-client
            depends: "init-database"
            arguments:
              parameters:
                - name: cmd
                  value: curl -XPOST 'http://{{tasks.influx.ip}}:8086/write?db=mydb' -d 'cpu,host=server03,region=useast load=15.4'
          - name: consumer
            template: influxdb-client
            depends: "producer-1 && producer-2 && producer-3"
            arguments:
              parameters:
                - name: cmd
                  value: curl --silent -G http://{{tasks.influx.ip}}:8086/query?pretty=true --data-urlencode "db=mydb" --data-urlencode "q=SELECT * FROM cpu"
    - name: influxdb
      daemon: true
      container:
        image: influxdb:1.2
        readinessProbe:
          httpGet:
            path: /ping
            port: 8086
          initialDelaySeconds: 5
          timeoutSeconds: 1
    - name: influxdb-client
      inputs:
        parameters:
          - name: cmd
      container:
        image: appropriate/curl:latest
        command: ["sh", "-c"]
        args: ["{{inputs.parameters.cmd}}"]
    - name: http
      http:
        url: "https://raw.githubusercontent.com/argoproj/argo-workflows/4e450e250168e6b4d51a126b784e90b11a0162bc/pkg/apis/workflow/v1alpha1/generated.swagger.json"
This issue has been automatically marked as stale because it has not had recent activity and needs more information. It will be closed if no further activity occurs.
This issue has been closed due to inactivity and lack of information. If you still encounter this issue, please add the requested information and re-open.
@tczhao Can this be reopened? We're still experiencing this issue on the current version of Argo Workflows.