Many zombie processes forked by st2actionrunner
SUMMARY
st2actionrunner forks many child processes that end up as zombies:
/opt/stackstorm/st2/bin/python /opt/stackstorm/st2/bin/st2actionrunner --config-file /etc/st2/st2.conf
root 16338 0.0 0.0 0 0 ? Zs Oct24 0:00 _ [sudo]
root 16343 0.0 0.0 0 0 ? Zs Oct24 0:00 _ [sudo]
root 16362 0.0 0.0 0 0 ? Zs Oct24 0:00 _ [sudo]
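To track how many zombies accumulate over time, the `ps` listing above can be counted with a small script. This is only a sketch: the helper name `count_zombies` is hypothetical, and it assumes the `ps aux` column layout (USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND), where a STAT value beginning with `Z` marks a zombie.

```python
def count_zombies(ps_output):
    """Count lines of `ps aux`-style output whose STAT column starts with 'Z'."""
    count = 0
    for line in ps_output.splitlines():
        fields = line.split()
        # STAT is the 8th column in `ps aux` output (index 7).
        if len(fields) >= 8 and fields[7].startswith("Z"):
            count += 1
    return count


# Sample taken from the listing in this report.
sample = """\
root 16338 0.0 0.0 0 0 ? Zs Oct24 0:00 _ [sudo]
root 16343 0.0 0.0 0 0 ? Zs Oct24 0:00 _ [sudo]
root 16362 0.0 0.0 0 0 ? Zs Oct24 0:00 _ [sudo]
"""
print(count_zombies(sample))  # prints 3
```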
STACKSTORM VERSION
2.6.0
Running in Kubernetes; the start command is '/opt/stackstorm/st2/bin/python /opt/stackstorm/st2/bin/st2actionrunner --config-file /etc/st2/st2.conf'
Steps to reproduce the problem
I do not know how to reproduce the problem reliably, but the action runner has many failed tasks:
| + 660345606d32900001909d4c | email.mistral-network-check | [email protected] | failed (2130s elapsed) | Tue, 26 Mar 2024 22:00:00 UTC | Tue, 26 Mar 2024 22:03:33 UTC |
| + 66034d716d32900001909d56 | email.mistral-network-check | [email protected] | failed (3800s elapsed) | Tue, 26 Mar 2024 22:34:25 UTC | Tue, 26 Mar 2024 22:35:03 UTC |
| + 66035b816d32900001909d5f | email.mistral-network-check | [email protected] | failed (3800s elapsed) | Tue, 26 Mar 2024 23:34:25 UTC | Tue, 26 Mar 2024 23:35:03 UTC |
| + 660369916d32900001909d68 | email.mistral-network-check | [email protected] | failed (4200s elapsed) | Wed, 27 Mar 2024 00:34:25 UTC | Wed, 27 Mar 2024 00:35:07 UTC |
| + 660377a16d32900001909d71 | email.mistral-network-check | [email protected] | failed (4400s elapsed) | Wed, 27 Mar 2024 01:34:25 UTC | Wed, 27 Mar 2024 01:35:09 UTC |
| + 660385b16d32900001909d7a | email.mistral-network-check | [email protected] | failed (4200s elapsed) | Wed, 27 Mar 2024 02:34:25 UTC | Wed, 27 Mar 2024 02:35:07 UTC |
+----------------------------+-----------------------------+-------------------+------------------------+-------------------------------+-------------------------------+
Expected Results
No zombie processes.
Actual Results
What happened? What output did you get?
There are many zombie processes under st2actionrunner.
Are you sure this is the version? 2.6.0 is very old.
Yes, 2.6.0. It may be caused by the core.local action: when I run the action core.local cmd=<% $.cmd %> timeout=<% $.timeout %> and the task fails or times out, one zombie process appears.
I think that if the logic of core.local has not changed, this will also happen in the latest version.
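For context, a zombie appears whenever a parent kills or abandons a child without collecting its exit status. The sketch below is not StackStorm's code; `run_with_timeout` is a hypothetical helper that only illustrates the general pattern with Python's standard `subprocess` module: after a timeout, the child has to be both killed and waited on, otherwise it stays in state `Z` until the parent exits.

```python
import subprocess
import sys


def run_with_timeout(cmd, timeout):
    """Run cmd and return (returncode, timed_out).

    On timeout the child is killed and then waited on, so the kernel
    can release its process table entry (no zombie is left behind).
    """
    proc = subprocess.Popen(cmd, stdout=subprocess.PIPE, stderr=subprocess.PIPE)
    try:
        proc.communicate(timeout=timeout)
        return proc.returncode, False
    except subprocess.TimeoutExpired:
        proc.kill()
        # communicate() waits for the killed child and collects its exit
        # status; skipping this step is what leaves a zombie.
        proc.communicate()
        return proc.returncode, True


if __name__ == "__main__":
    # A child that sleeps longer than the allowed timeout.
    slow = [sys.executable, "-c", "import time; time.sleep(30)"]
    print(run_with_timeout(slow, 0.5))
```

Whether this is what goes wrong inside core.local in 2.6.0 is an assumption. The zombies in the listing are `[sudo]` processes, so the missing wait, if that is the cause, would be on the sudo child that the runner spawned.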
Did you set a retry mechanism? I ran the local command without seeing this problem, on version 3.8.0.
Yes, this is my retry policy:
retry:
  count: 5
  delay: 300
The task will time out.
When you talk about zombie processes, are they caused by retry?