
Denoise does not terminate subprocesses on timeout

Open smarr opened this issue 4 years ago • 3 comments

Currently, it seems like the benchmark processes might survive denoise when it is terminated because of a timeout.

Needs to be further investigated. There might be multiple options:

  1. make sure denoise properly terminates all subprocesses on exit
  2. handle timeout in denoise

Option 1 seems preferable.

The issue should be confirmed and ideally covered by a test.
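A minimal sketch of what option 1 could look like, assuming the benchmark runs as the same user as the parent; this is not ReBench's actual code. The hypothetical helper `run_with_timeout` starts the command in its own process group, so that a timeout can kill the whole group rather than only the direct child:

```python
import os
import signal
import subprocess


def run_with_timeout(cmd, timeout_s):
    """Run cmd; on timeout, kill its whole process group (hypothetical helper)."""
    # start_new_session=True makes the child the leader of a new session and
    # process group, so its descendants share a group ID that can be signalled.
    proc = subprocess.Popen(cmd, start_new_session=True)
    try:
        return proc.wait(timeout=timeout_s)
    except subprocess.TimeoutExpired:
        try:
            # The child is the group leader, so its PID is also the group ID.
            os.killpg(proc.pid, signal.SIGKILL)
        except ProcessLookupError:
            pass  # the group exited on its own in the meantime
        proc.wait()  # reap the child so it does not linger as a zombie
        return None
```

This covers only processes the caller is allowed to signal. If the child was started through `sudo` and runs as root, `os.killpg` would raise `PermissionError` for an unprivileged caller.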

smarr avatar Apr 13 '21 13:04 smarr

@OctaveLarose looks like there's an issue for that already.

The following processes were still running after timeouts:

root      382184  0.0  0.0  11884  4616 ?        S    13:48   0:00 sudo rebench-denoise exec -- som-bc-jit BenchmarkHarness.som --gc WhileLoop 55 9000
root      382185 99.9  0.3 171708 165532 ?       R<   13:48 146:50 som-bc-jit BenchmarkHarness.som --gc WhileLoop 55 9000
gitlab-+  399485  0.0  0.0   2620   608 ?        S    15:28   0:00 /bin/sh -c sudo rebench-denoise exec -- som-bc-jit BenchmarkHarness.som --gc Loop 55  10000
root      399486  0.0  0.0  11876  4572 ?        S    15:28   0:00 sudo rebench-denoise exec -- som-bc-jit BenchmarkHarness.som --gc Loop 55 10000
root      399487 99.9  0.2 143484 137920 ?       R<   15:28  46:50 som-bc-jit BenchmarkHarness.som --gc Loop 55 10000
gitlab-+ 1644381  0.0  0.0 196636 34244 ?        Sl   13:47   0:01 /usr/bin/python3 /usr/local/bin/rebench --experiment=CI ID 13644 --branch=stack_caching -c rebench.conf m:yuria3
root     1652011  0.0  0.0  11876  4532 ?        S    14:23   0:00 sudo rebench-denoise exec -- som-bc-jit BenchmarkHarness.som --gc FieldLoop 55 900
root     1652012 99.9  0.2 109984 104376 ?       R<   14:23 115:52 som-bc-jit BenchmarkHarness.som --gc FieldLoop 55 900
gitlab-+ 1669472  0.0  0.0   2620   548 ?        S    16:03   0:00 /bin/sh -c sudo rebench-denoise exec -- som-bc-jit BenchmarkHarness.som --gc Permute 55  1500
root     1669473  0.0  0.0  11884  4528 ?        S    16:03   0:00 sudo rebench-denoise exec -- som-bc-jit BenchmarkHarness.som --gc Permute 55 1500
root     1669474 99.8  0.2 143036 137748 ?       R<   16:03  15:37 som-bc-jit BenchmarkHarness.som --gc Permute 55 1500

My guess is that the first issue is that the rebench process can't send a signal to the process running as root.

smarr avatar Nov 27 '22 15:11 smarr

A simple test:

root$ sleep 100000

user$ kill -9 pidOfSleep
kill: kill pidOfSleep failed: operation not permitted

So I think I'll add a feature to rebench-denoise to send the kill signal as root.
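The same permission check can be reproduced from Python, which is roughly what an unprivileged rebench process would run into. This is a sketch only; `can_signal` is a hypothetical helper, not part of ReBench:

```python
import os


def can_signal(pid):
    """Return True if the current user may send a signal to pid."""
    try:
        # Signal 0 delivers nothing; it only checks existence and permission.
        os.kill(pid, 0)
    except PermissionError:
        return False  # process exists but belongs to another user (EPERM)
    except ProcessLookupError:
        return False  # no such process (ESRCH)
    return True
```

For a root-owned process such as the `sleep` above, an unprivileged caller would get `False` here, which matches the "operation not permitted" error from `kill`.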

smarr avatar Nov 27 '22 15:11 smarr

Ok, first attempt: https://github.com/smarr/ReBench/tree/timeouts-and-signals

I still want to add signal handling so that denoise also looks for all its child processes and terminates them before it exits because of a signal.

smarr avatar Nov 27 '22 17:11 smarr