dag task failed with error msg "converting YAML to JSON: yaml: invalid map key"
Checklist
- [x] Double-checked my configuration.
- [x] Tested using the latest version.
- [x] Used the Emissary executor.
Summary
What happened/what you expected to happen?
Hello, I am using a DAG workflow in Argo. Task A outputs three files as output parameters, and task B uses these parameters as its input parameters. Sometimes both task A and task B work fine, but sometimes task B fails with "error converting YAML to JSON: yaml: invalid map key: map[interface {}]interface {}", and the Argo Server UI (see the picture below) shows that the input parameters of task B were never rendered to their actual values; only the value templates are shown. This looks like the same problem as https://github.com/argoproj/argo-workflows/issues/5960.
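The error message can be reproduced outside Argo. A minimal sketch (assuming PyYAML is available; this snippet is an illustration, not taken from the issue itself): if a placeholder such as `{{ inputs.parameters.sc_job }}` is left unrendered in a manifest, the literal braces make the YAML value a mapping whose key is itself a mapping, which is exactly what the Go-side "invalid map key: map[interface {}]interface {}" error complains about.

```python
import yaml  # PyYAML, assumed available

# Hypothetical minimal reproduction: an unrendered Argo placeholder left
# verbatim in a manifest. YAML reads the outer '{' as a flow mapping whose
# key is the inner mapping '{ inputs.parameters.sc_job }' -- and a mapping
# is not a valid map key.
unrendered = "completions: {{ inputs.parameters.sc_job }}"

try:
    yaml.safe_load(unrendered)
    print("parsed fine")
except yaml.YAMLError as exc:
    # PyYAML reports this as an unhashable (mapping) key.
    print(f"parse failed: {type(exc).__name__}")
```

This suggests the root cause is the substitution step silently not running, leaving the raw template in the manifest that is then re-parsed.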
What version are you running?
3.3.8
Diagnostics
Paste the smallest workflow that reproduces the bug. We must be able to run the workflow.
```yaml
metadata:
  name: epl-eftpjz89j
  generateName: epl-eftp
  namespace: argo
spec:
  templates:
    - name: diamond
      inputs: {}
      outputs: {}
      metadata: {}
      dag:
        tasks:
          - name: A
            template: create-queue
            arguments: {}
          - name: B
            template: run-label
            arguments:
              parameters:
                - name: sc_job
                  value: '{{tasks.A.outputs.parameters.sc_job}}'
                - name: para_job
                  value: '{{tasks.A.outputs.parameters.para_job}}'
            depends: createqueuet
    - name: create-queue
      inputs: {}
      outputs:
        parameters:
          - name: sc_job
            valueFrom:
              path: /sc_job
          - name: p_class
            valueFrom:
              path: /p_class
          - name: para_job
            valueFrom:
              path: /para_job
      metadata: {}
      container:
        name: ''
        image: 'ftp-autolabeling-dask:main'
        command:
          - python3.7
          - /app/src/workflow/util/create_task_queue.py
        resources: {}
        imagePullPolicy: Always
    - name: run-label
      inputs:
        parameters:
          - name: sc_job
          - name: para_job
      outputs: {}
      metadata: {}
      resource:
        action: create
        manifest: |
          apiVersion: batch/v1
          kind: Job
          metadata:
            generateName: epl-{{workflow.parameters.task_name}}-
          spec:
            ttlSecondsAfterFinished: 259200
            backoffLimit: 10000
            completions: {{ inputs.parameters.sc_job }}
            parallelism: {{ inputs.parameters.para_job }}
            template:
              metadata:
                annotations:
                  creationTimestamp: null
              spec:
                nodeSelector:
                  cac_mode: cpu
                containers:
                  - name: container-p8gnk7
                    image: 'ftp-autolabeling-dask:main'
                    command:
                      - bash
                    args:
                      - /app/resource/script/start_epl_dis_task_runner.sh
                      - "{{workflow.parameters.batch_size}}"
                      - "{{workflow.parameters.batch_para_num}}"
                      - {{workflow.parameters.task_name}}
                      - {{workflow.parameters.branch}}
                    imagePullPolicy: Always
                restartPolicy: Never
                dnsPolicy: ClusterFirst
                serviceAccountName: default
                serviceAccount: default
                securityContext: {}
                schedulerName: default-scheduler
        setOwnerReference: true
        successCondition: 'status.succeeded == {{ inputs.parameters.sc_job }}'
        failureCondition: status.failed > 10000
  entrypoint: diamond
```
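One detail worth flagging in the manifest above (a hedged observation, not raised in the thread): the templated `args` entries are quoted inconsistently. A quoted placeholder remains a plain YAML string even if Argo fails to substitute it, while an unquoted one becomes a nested flow mapping and makes the re-parsed manifest invalid:

```yaml
args:
  - "{{workflow.parameters.batch_size}}"  # quoted: still a valid string if left unrendered
  - {{workflow.parameters.task_name}}     # unquoted: parses as a mapping if left unrendered
```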
Message from the maintainers:
Impacted by this bug? Give it a 👍. We prioritise the issues with the most 👍.
Thanks for submitting. It definitely seems like a bug if the behavior is inconsistent from run to run.
@thinkhoon Can you check task A's output during the failure? It looks like A's output is invalid and cannot be marshalled.
> @thinkhoon Can you check task A's output during the failure? It looks like A's output is invalid and cannot be marshalled.

Hi Sara, the output is definitely correct, because the k8s Job (which needs the output parameters of task A to define the number of successful pods) is created correctly and runs fine.
Fix this `depends: createqueuet`; it is wrong.
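For reference, a hedged sketch of that correction (assuming the intent was for B to depend on task A, the only other task defined in the DAG):

```yaml
- name: B
  template: run-label
  depends: A  # was 'createqueuet', which matches no task name in the DAG
```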
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. If this is a mentoring request, please provide an update here. Thank you for your contributions.
This issue has been closed due to inactivity. Feel free to re-open if you still encounter this issue.