oom-killer kills the runltp parent process
When the oom testcase is running, the runltp process is killed. As a result, subsequent testcases cannot be executed. Error log: Killing process 97867 (runltp) with signal SIGTERM
I also hit a similar problem because of a systemd bug. Which systemd version does your system use?
Hello, the systemd version is v243: https://github.com/systemd/systemd/tree/v243
It does not seem to be the same problem. Do you have the full dmesg from this run?

It seems it also killed the ssh-agent process. I guess you connected to this machine over ssh and then ran the LTP test case, and then the session was closed. Is that right? Which Linux distribution version do you use?
@xuyang0410 Hi, I also encountered the same problem. runltp was killed by the oom-killer when oom02 was executed.
May 16 21:02:24 localhost kernel: [ pid ] uid tgid total_vm rss pgtables_bytes swapents oom_score_adj name
May 16 21:02:24 localhost kernel: [ 1211] 0 1211 544 6 327680 107 -1000 systemd-udevd
May 16 21:02:24 localhost kernel: [ 1501] 81 1501 517 35 393216 44 -900 dbus-daemon
...
May 16 21:02:24 localhost kernel: [ 98679] 0 98679 3549 2 458752 29 0 runltp
May 16 21:02:24 localhost kernel: [ 98867] 0 98867 56 0 393216 24 0 ltp-pan
May 16 21:02:24 localhost kernel: [2343580] 0 2343580 1014 124 393216 78 -250 systemd-journal
...
May 16 21:02:24 localhost kernel: [ 912894] 0 912894 52 0 393216 14 -1000 oom02
May 16 21:02:24 localhost kernel: [ 912895] 0 912895 52 0 393216 17 -1000 oom02
May 16 21:02:24 localhost kernel: [ 913199] 0 913199 4465 38 393216 122 0 sssd_be
May 16 21:02:24 localhost kernel: [ 913201] 997 913201 39822 84 786432 78 0 polkitd
May 16 21:02:24 localhost kernel: [ 913290] 0 913290 37091 82 720896 395 0 Xorg
May 16 21:02:24 localhost kernel: [ 913299] 0 913299 4773 41 327680 85 0 sssd_nss
May 16 21:02:24 localhost kernel: [ 913493] 0 913493 17180 313 524288 379 0 tuned
May 16 21:02:24 localhost kernel: [ 913497] 0 913497 6504 2 458752 234 0 udisksd
May 16 21:02:24 localhost kernel: [ 913866] 987 913866 298 0 327680 43 0 dbus-launch
May 16 21:02:24 localhost kernel: [ 913867] 0 913867 7593 23 458752 154 0 NetworkManager
May 16 21:02:24 localhost kernel: [ 913893] 987 913893 475 0 327680 47 0 dbus-daemon
May 16 21:02:24 localhost kernel: [ 914005] 987 914005 10302 1 458752 778 0 onboard
May 16 21:02:24 localhost kernel: [ 914008] 987 914008 4825 0 393216 63 0 at-spi-bus-laun
May 16 21:02:24 localhost kernel: [ 914013] 987 914013 473 0 393216 52 0 dbus-daemon
May 16 21:02:24 localhost kernel: [ 914139] 0 914139 3104755 1831937 18481152 75 0 oom02
May 16 21:02:24 localhost kernel: Out of memory: Kill process 914139 (oom02) score 712 or sacrifice child
May 16 21:02:24 localhost kernel: Killed process 914139 (oom02) total-vm:198704320kB, anon-rss:117243072kB, file-rss:640kB, shmem-rss:0kB
May 16 21:02:24 localhost kernel: oom_reaper: reaped process 914139 (oom02), now anon-rss:117261376kB, file-rss:0kB, shmem-rss:0kB
May 16 21:02:24 localhost kernel: oom02 invoked oom-killer: gfp_mask=0x6200ca(GFP_HIGHUSER_MOVABLE), nodemask=(null), order=0, oom_score_adj=0
First, the oom-killer kills oom02 and tries to reclaim its memory, but the reclaim fails because the memory is locked.
The following is the output of the trace logging I added to the kernel:
oom_reaper-57 [007] .... 126.063581: __oom_reap_task_mm: gh: vma is anon:1048691, range=65536
oom_reaper-57 [007] .... 126.063581: __oom_reap_task_mm: gh: vma is anon:1048691, range=196608
oom_reaper-57 [007] .... 126.063582: __oom_reap_task_mm: gh: vma continue: 1056883, range:3221225472
oom_reaper-57 [007] .... 126.063583: __oom_reap_task_mm: gh: vma is anon:112, range=65536
oom_reaper-57 [007] .... 126.063584: __oom_reap_task_mm: gh: vma is anon:1048691, range=8388608
vma continue: 1056883, range:3221225472 is the memory that cannot be reclaimed. 1056883 (0x102073) is vma->vm_flags; it has the VM_LOCKED flag set, indicating that the memory is locked and cannot be reclaimed by the oom_reaper. It will be released when it is no longer in use.
Next, the oom-killer tries to kill other processes to free memory. Unfortunately, runltp was killed:
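As a quick cross-check of the vm_flags values in the trace above, here is a minimal Python sketch that decodes them. The bit values are a subset copied from include/linux/mm.h and are assumed to match the kernel that produced the trace; the helper name decode_vm_flags is only for illustration.

```python
# Subset of VM_* bits from include/linux/mm.h. Assumption: these values
# match the kernel that produced the trace above.
VM_FLAGS = {
    0x00000001: "VM_READ",
    0x00000002: "VM_WRITE",
    0x00000004: "VM_EXEC",
    0x00000008: "VM_SHARED",
    0x00000010: "VM_MAYREAD",
    0x00000020: "VM_MAYWRITE",
    0x00000040: "VM_MAYEXEC",
    0x00000080: "VM_MAYSHARE",
    0x00002000: "VM_LOCKED",
    0x00100000: "VM_ACCOUNT",
}


def decode_vm_flags(value):
    """Return the names of the known VM_* bits that are set in value."""
    return [name for bit, name in VM_FLAGS.items() if value & bit]


if __name__ == "__main__":
    # The three vm_flags values that appear in the trace output.
    for flags in (1048691, 1056883, 112):
        print(f"{flags} ({flags:#x}): {' | '.join(decode_vm_flags(flags))}")
```

With these bit definitions, 1056883 differs from 1048691 only by the VM_LOCKED bit (0x2000), which matches the one vma the trace reports as skipped.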
May 16 21:02:24 localhost kernel: [ 914008] 987 914008 4825 0 393216 65 0 at-spi-bus-laun
May 16 21:02:24 localhost kernel: [ 914013] 987 914013 473 0 393216 52 0 dbus-daemon
May 16 21:02:24 localhost kernel: [ 914015] 987 914015 2583 0 458752 78 0 at-spi2-registr
May 16 21:02:24 localhost kernel: [ 914139] 0 914139 3104755 1832837 18481152 0 0 oom02
May 16 21:02:24 localhost kernel: Out of memory: Kill process 913199 (sssd_be) score 0 or sacrifice child
May 16 21:02:24 localhost kernel: Killed process 913199 (sssd_be) total-vm:285760kB, anon-rss:640kB, file-rss:0kB, shmem-rss:0kB
May 16 21:02:24 localhost kernel: oom_reaper: reaped process 913199 (sssd_be), now anon-rss:0kB, file-rss:0kB, shmem-rss:0kB
May 16 21:02:24 localhost kernel: Out of memory: Kill process 912518 (sssd) score 0 or sacrifice child
May 16 21:02:24 localhost kernel: Killed process 912624 (sssd_pam) total-vm:272192kB, anon-rss:0kB, file-rss:0kB, shmem-rss:0kB
May 16 21:02:24 localhost kernel: oom_reaper: reaped process 912624 (sssd_pam), now anon-rss:0kB, file-rss:0kB, shmem-rss:0kB
May 16 21:02:24 localhost kernel: Out of memory: Kill process 912518 (sssd) score 0 or sacrifice child
... // many processes killed
May 16 21:02:24 localhost kernel: Out of memory: Kill process 98679 (runltp) score 0 or sacrifice child
May 16 21:02:24 localhost kernel: Killed process 98867 (ltp-pan) total-vm:3584kB, anon-rss:0kB, file-rss:0kB, shmem-rss:0kB
May 16 21:02:24 localhost kernel: oom_reaper: reaped process 98867 (ltp-pan), now anon-rss:0kB, file-rss:0kB, shmem-rss:0kB
May 16 21:02:24 localhost kernel: Out of memory: Kill process 98679 (runltp) score 0 or sacrifice child
May 16 21:02:24 localhost kernel: Killed process 98679 (runltp) total-vm:227136kB, anon-rss:0kB, file-rss:128kB, shmem-rss:0kB
May 16 21:02:24 localhost kernel: oom_reaper: reaped process 98679 (runltp), now anon-rss:0kB, file-rss:0kB, shmem-rss:0kB
May 16 21:02:24 localhost kernel: Out of memory: Kill process 1755 (atd) score 0 or sacrifice child
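To see at a glance which processes were actually killed in a log like the one above, the "Killed process" lines can be filtered out. This is only a rough sketch: the regular expression assumes the message format shown in this log, and the function name killed_processes is made up for illustration.

```python
import re

# Matches lines such as:
#   ... kernel: Killed process 98679 (runltp) total-vm:227136kB, ...
# Assumption: the kernel message format is the one shown in this thread.
KILLED_RE = re.compile(r"Killed process (\d+) \(([^)]+)\)")


def killed_processes(lines):
    """Return (pid, name) for every 'Killed process' line, in log order."""
    victims = []
    for line in lines:
        match = KILLED_RE.search(line)
        if match:
            victims.append((int(match.group(1)), match.group(2)))
    return victims


if __name__ == "__main__":
    sample = [
        "May 16 21:02:24 localhost kernel: Out of memory: Kill process 98679 (runltp) score 0 or sacrifice child",
        "May 16 21:02:24 localhost kernel: Killed process 98867 (ltp-pan) total-vm:3584kB, anon-rss:0kB, file-rss:0kB, shmem-rss:0kB",
        "May 16 21:02:24 localhost kernel: Killed process 98679 (runltp) total-vm:227136kB, anon-rss:0kB, file-rss:128kB, shmem-rss:0kB",
    ]
    for pid, name in killed_processes(sample):
        print(pid, name)
```

Note that the "Kill process ... or sacrifice child" lines name the selected task, while the "Killed process" lines name the actual victim, which is why ltp-pan is killed before runltp here.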
oom02 sets the oom_score_adj of the parent oom02 process to -1000 to prevent it from being killed by the oom-killer, and sets the oom_score_adj of the child oom02 process to 0.
So, should we set the default oom_score_adj of runltp to -1000 too?
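For reference, a minimal sketch of what protecting a process this way could look like, written in Python for illustration only (runltp itself is a shell script, and this is not a proposed patch). It writes to /proc/self/oom_score_adj; the function names and the path parameter are assumptions made here so the helper can be tried against an ordinary file.

```python
def set_oom_score_adj(value, path="/proc/self/oom_score_adj"):
    """Write an oom_score_adj value for the current process.

    Valid values are -1000..1000; -1000 exempts the process from the
    oom-killer. Lowering the value normally requires root privileges.
    """
    if not -1000 <= value <= 1000:
        raise ValueError("oom_score_adj must be between -1000 and 1000")
    with open(path, "w") as f:
        f.write(str(value))


def read_oom_score_adj(path="/proc/self/oom_score_adj"):
    """Read back the oom_score_adj value stored at path."""
    with open(path) as f:
        return int(f.read().strip())


if __name__ == "__main__":
    # Read-only demonstration; call set_oom_score_adj(-1000) as root to
    # actually protect the process.
    try:
        print("current oom_score_adj:", read_oom_score_adj())
    except OSError as err:
        print("could not read oom_score_adj:", err)
```

One thing to keep in mind with this approach: oom_score_adj is inherited by child processes, so if runltp ran with -1000, every test it starts would also be exempt unless the value is reset for the children, as oom02 already does for its own child.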
