many zombie processes forked by st2actionrunner

Open magiceses opened this issue 10 months ago • 5 comments

SUMMARY

The st2actionrunner process leaves behind many zombie child processes:

/opt/stackstorm/st2/bin/python /opt/stackstorm/st2/bin/st2actionrunner --config-file /etc/st2/st2.conf
root 16338 0.0 0.0 0 0 ? Zs Oct24 0:00 _ [sudo]
root 16343 0.0 0.0 0 0 ? Zs Oct24 0:00 _ [sudo]
root 16362 0.0 0.0 0 0 ? Zs Oct24 0:00 _ [sudo]
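
To count entries like the ones above without reading ps output by eye, the following is a minimal sketch that scans /proc for processes in state Z and optionally filters by parent PID. It assumes a Linux /proc filesystem; the function name list_zombies is hypothetical and is not part of StackStorm.

```python
import os


def list_zombies(parent_pid=None):
    """Return (pid, ppid, comm) for every zombie visible in /proc (Linux only).

    If parent_pid is given, only zombies whose parent is that PID are returned.
    """
    zombies = []
    for entry in os.listdir("/proc"):
        if not entry.isdigit():
            continue
        try:
            with open(f"/proc/{entry}/stat") as f:
                data = f.read()
        except (FileNotFoundError, ProcessLookupError, PermissionError):
            # The process exited or is not readable; skip it.
            continue
        # The command name is wrapped in parentheses and may contain spaces,
        # so split on the last ")" rather than on whitespace.
        comm = data[data.index("(") + 1:data.rindex(")")]
        fields = data[data.rindex(")") + 1:].split()
        state, ppid = fields[0], int(fields[1])
        if state == "Z" and (parent_pid is None or ppid == parent_pid):
            zombies.append((int(entry), ppid, comm))
    return zombies


if __name__ == "__main__":
    for pid, ppid, comm in list_zombies():
        print(f"zombie pid={pid} ppid={ppid} comm={comm}")
```

Passing the PID of the st2actionrunner process as parent_pid would restrict the result to its own unreaped children.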

STACKSTORM VERSION

2.6.0

It runs in Kubernetes; the start command is '/opt/stackstorm/st2/bin/python /opt/stackstorm/st2/bin/st2actionrunner --config-file /etc/st2/st2.conf'

Steps to reproduce the problem

I do not know how to reproduce the problem, but the actionrunner has many failed tasks:

| + 660345606d32900001909d4c | email.mistral-network-check  | [email protected] | failed (2130s elapsed) | Tue, 26 Mar 2024 22:00:00    | Tue, 26 Mar 2024 22:03:33 UTC |
|                            |                              |                   |                          | UTC                          |                               |
| + 66034d716d32900001909d56 | email.mistral-network-check  | [email protected] | failed (3800s elapsed)  | Tue, 26 Mar 2024 22:34:25    | Tue, 26 Mar 2024 22:35:03 UTC |
|                            |                              |                   |                          | UTC                          |                               |
| + 66035b816d32900001909d5f | email.mistral-network-check  | [email protected] | failed (3800s elapsed)  | Tue, 26 Mar 2024 23:34:25    | Tue, 26 Mar 2024 23:35:03 UTC |
|                            |                              |                   |                          | UTC                          |                               |
| + 660369916d32900001909d68 | email.mistral-network-check  | [email protected] | failed (4200s elapsed)  | Wed, 27 Mar 2024 00:34:25    | Wed, 27 Mar 2024 00:35:07 UTC |
|                            |                              |                   |                          | UTC                          |                               |
| + 660377a16d32900001909d71 | email.mistral-network-check  | [email protected] | failed (4400s elapsed)  | Wed, 27 Mar 2024 01:34:25    | Wed, 27 Mar 2024 01:35:09 UTC |
|                            |                              |                   |                          | UTC                          |                               |
| + 660385b16d32900001909d7a | email.mistral-network-check  | [email protected] | failed (4200s elapsed)  | Wed, 27 Mar 2024 02:34:25    | Wed, 27 Mar 2024 02:35:07 UTC |
|                            |                              |                   |                          | UTC                          |                               |
+----------------------------+------------------------------+-------------------+--------------------------+------------------------------+-------------------------------+

Expected Results

no zombie processes

Actual Results


There are many zombie processes.

magiceses, Mar 27 '24 02:03

Are you sure this is the version? 2.6.0 is very old.

chain312, Mar 27 '24 03:03

Yes, 2.6.0. It may be caused by the core.local action: when I execute the action core.local cmd=<% $.cmd %> timeout=<% $.timeout %> and the task fails or times out, one zombie process appears. Unless the logic of core.local has changed, I think this will also happen in the latest version.
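
The general mechanism behind this hypothesis can be sketched in a few lines. This is not the core.local runner code, only an illustration under the assumption that a parent kills a timed-out child and never collects its exit status; the helper names (kill_on_timeout, proc_state, wait_for_state) are hypothetical, and the state check relies on the Linux /proc filesystem.

```python
import subprocess
import time


def proc_state(pid):
    """Return the one-letter state from /proc/<pid>/stat, or None if the PID is gone."""
    try:
        with open(f"/proc/{pid}/stat") as f:
            data = f.read()
    except (FileNotFoundError, ProcessLookupError):
        return None
    # The state field is the first one after the parenthesised command name.
    return data.rsplit(")", 1)[1].split()[0]


def wait_for_state(pid, wanted, deadline=2.0):
    """Poll until the process reaches one of the wanted states or the deadline passes."""
    end = time.monotonic() + deadline
    while time.monotonic() < end:
        state = proc_state(pid)
        if state in wanted:
            return state
        time.sleep(0.05)
    return proc_state(pid)


def kill_on_timeout(cmd, timeout, reap):
    """Run cmd, kill it if it exceeds timeout, and reap it only if reap is True."""
    child = subprocess.Popen(cmd)
    try:
        child.wait(timeout=timeout)
    except subprocess.TimeoutExpired:
        child.kill()      # SIGKILL the command that ran too long
        if reap:
            child.wait()  # collect the exit status; this clears the process-table entry
    return child


if __name__ == "__main__":
    leaked = kill_on_timeout(["sleep", "30"], timeout=0.2, reap=False)
    print("without wait():", wait_for_state(leaked.pid, ("Z",)))
    leaked.wait()  # clean up the zombie created for the demonstration
    reaped = kill_on_timeout(["sleep", "30"], timeout=0.2, reap=True)
    print("with wait():", proc_state(reaped.pid))
```

If the runner follows the first path on timeout or failure, each such task would leave one defunct entry until the parent exits or calls wait on the child.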

magiceses, Mar 28 '24 01:03

Did you configure a retry policy? I ran the local command on version 3.8.0 and did not see this problem.

chain312, Mar 28 '24 03:03

Yes, this is my retry policy:

retry:
  count: 5
  delay: 300

The task will time out.

magiceses, Mar 28 '24 03:03

When you talk about zombie processes, are they caused by the retries?

chain312, Mar 28 '24 04:03