argo-workflows
Offloading large workflows raises db syntax errors that prevent the offloading
Pre-requisites
- [X] I have double-checked my configuration
- [ ] I can confirm the issue exists when I tested with :latest
- [X] I have searched existing issues and could not find a match for this bug
- [x] I'd like to contribute the fix myself (see contributing guide)
What happened/what did you expect to happen?
I enabled nodeStatusOffload for the workflow-controller and configured PostgreSQL under the persistence section of the workflow-controller configuration.
The connection succeeds, as the controller logs show the database migrations running.
But when it reaches the offloading step, it fails with this error:
workflow is longer than maximum allowed size. compressed size 1053022 > maxSize 1048576. Tried to offload but encountered an error: pq: syntax error at or near '('
I also tried this with MySQL instead of PostgreSQL, but got a different syntax error:
workflow is longer than maximum allowed size. compressed size 1053022 > maxSize 1048576. Tried to offload but encountered an error: ERROR 1064: You have an error in your SQL syntax; check the manual that corresponds to your MYSQL server version for the right syntax to use near '('clustername', 'namespace', 'uid', 'nodes', 'version') VALUES (?' at line 2.
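For reference, both errors report the same size check before the SQL failure. The limit of 1048576 bytes is exactly 1 MiB, and the compressed workflow exceeds it by only 4446 bytes. A minimal sketch of that arithmetic, using only the two values quoted in the error messages (the variable names are illustrative, not Argo identifiers):

```python
# Values copied verbatim from the error messages above.
compressed_size = 1053022  # bytes, "compressed size" in the error
max_size = 1048576         # bytes, "maxSize" in the error

# The limit is exactly 1 MiB.
assert max_size == 1024 * 1024

# How far over the limit the compressed workflow is.
overflow = compressed_size - max_size
overflow_ratio = overflow / max_size

print(f"over the limit by {overflow} bytes ({overflow_ratio:.2%} of maxSize)")
```

The workflow is only marginally over the limit, so this is the point at which the controller attempts to offload the node status and hits the syntax error.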
I want my Argo Server to run very large workflows. To test this, I generated a sample DAG whose demo tasks simply print a message, and ran it with 700 of those tasks and no dependencies between them.
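The exact manifest used is not included in this report. As a hedged illustration only, a DAG of this shape could be generated with a short script like the one below. The names `build_dag_workflow`, `print-message`, `large-dag-`, and the `busybox` image are assumptions made for this sketch, not taken from the original workflow.

```python
import json


def build_dag_workflow(task_count: int = 700) -> dict:
    """Build a Workflow manifest whose DAG has `task_count` independent tasks.

    Hypothetical sketch: template names and the image are assumptions.
    """
    # Each task calls the same template and declares no dependencies,
    # so all tasks are eligible to run in parallel.
    tasks = [
        {
            "name": f"task-{i}",
            "template": "print-message",
            "arguments": {
                "parameters": [{"name": "message", "value": f"step {i}"}]
            },
        }
        for i in range(task_count)
    ]
    return {
        "apiVersion": "argoproj.io/v1alpha1",
        "kind": "Workflow",
        "metadata": {"generateName": "large-dag-"},
        "spec": {
            "entrypoint": "main",
            "templates": [
                {"name": "main", "dag": {"tasks": tasks}},
                {
                    "name": "print-message",
                    "inputs": {"parameters": [{"name": "message"}]},
                    "container": {
                        "image": "busybox",  # public image, assumed for the sketch
                        "command": ["echo"],
                        "args": ["{{inputs.parameters.message}}"],
                    },
                },
            ],
        },
    }


manifest = build_dag_workflow(700)
# JSON is accepted wherever a Kubernetes manifest file is expected.
manifest_json = json.dumps(manifest, indent=2)
```

Writing `manifest_json` to a file and creating it with `kubectl create -f <file>` in the Argo namespace should produce one node per task, which is what makes the node status large enough to approach the size limit.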
Version
v3.4.6
Paste a small workflow that reproduces the issue. We must be able to run the workflow; don't enter a workflow that uses private images.
I simply ran a workflow that prints some messages, nothing complex.
I loaded my DAG with 700 steps like this.
Logs from the workflow controller
kubectl logs -n argo deploy/workflow-controller | grep ${workflow}
Logs from your workflow's wait container
kubectl logs -n argo -c wait -l workflows.argoproj.io/workflow=${workflow},workflow.argoproj.io/phase!=Succeeded
- [x] I can confirm the issue exists when I tested with :latest
v3.4.6
You're on a pretty old version, so I would suggest trying with :latest or 3.4.16 at least.
Please fill out the issue template accurately; it asks for :latest usage very intentionally.
Looking at the diff v3.4.6..v3.4.16, #10887 in particular seems related (although the error is different).
I'll be able to test with :latest by Sunday. What other information can I provide?
This issue has been automatically marked as stale because it has not had recent activity and needs more information. It will be closed if no further activity occurs.
This issue has been closed due to inactivity and lack of information. If you still encounter this issue, please add the requested information and re-open.