postgresql-container
postgres process killed with kill -9 (looks like a bug)
We are running PostgreSQL 12.1 inside a container (we use this image: https://catalog.redhat.com/software/containers/rhel8/postgresql-12/5db133bd5a13461646df330b). We are facing a strange situation in which some Postgres processes get killed with kill -9. I enabled debug logging in PostgreSQL. At first I thought it was an OOM kill, so I raised the container's resources, but it is not resource-related: memory and CPU utilization are way under what is allocated (e.g. 8 CPUs and 100 GB of memory allocated, with only 1 CPU and 37.8 GB of memory used). The issue appears randomly (sometimes the DB is not doing anything at all). One more strange thing: the process ID that gets killed with kill -9 never appears earlier in the logs, as if the process had never been forked. Because Postgres runs in a container (OpenShift environment), we cannot get the kernel dmesg.
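(For reference, if cluster-admin access is available, the node's kernel log can usually still be reached through a node debug pod; this is just a sketch, and <node-name> is a placeholder:)
oc debug node/<node-name>
chroot /host
dmesg | grep -iE 'killed process|out of memory'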
Here is a snippet of the log; the full log is also attached.
:28:06 EST,session-id=600ccd26.50f5cLOCATION: SharedInvalBackendInit, sinvaladt.c:324
2021-01-23 20:28:07.648 EST [46]=[33197-1]user=,db=,app=,client=,session-start=2021-01-21 02:03:08 EST,session-id=6009272c.2eDEBUG: 00000: recycled write-ahead log file "000000010000014900000021"
2021-01-23 20:28:07.648 EST [46]=[33198-1]user=,db=,app=,client=,session-start=2021-01-21 02:03:08 EST,session-id=6009272c.2eLOCATION: RemoveXlogFile, xlog.c:4106
2021-01-23 20:28:07.805 EST [331612]=[13-1]user=postgres,db=postgres,app=[unknown],client=[local],session-start=2021-01-23 20:28:06 EST,session-id=600ccd26.50f5cDEBUG: 00000: StartTransaction(1) name: unnamed; blockState: DEFAULT; state: INPROGRESS, xid/subid/cid: 0/1/0
2021-01-23 20:28:07.805 EST [331612]=[14-1]user=postgres,db=postgres,app=[unknown],client=[local],session-start=2021-01-23 20:28:06 EST,session-id=600ccd26.50f5cLOCATION: ShowTransactionStateRec, xact.c:5327
2021-01-23 20:28:08.142 EST [1]=[292190-1]user=,db=,app=,client=,session-start=2021-01-21 02:03:07 EST,session-id=6009272b.1DEBUG: 00000: reaping dead processes
2021-01-23 20:28:08.142 EST [1]=[292191-1]user=,db=,app=,client=,session-start=2021-01-21 02:03:07 EST,session-id=6009272b.1LOCATION: reaper, postmaster.c:2888
2021-01-23 20:28:08.142 EST [1]=[292192-1]user=,db=,app=,client=,session-start=2021-01-21 02:03:07 EST,session-id=6009272b.1DEBUG: 00000: server process (PID 331611) was terminated by signal 9: Killed
2021-01-23 20:28:08.142 EST [1]=[292193-1]user=,db=,app=,client=,session-start=2021-01-21 02:03:07 EST,session-id=6009272b.1LOCATION: LogChildExit, postmaster.c:3686
2021-01-23 20:28:08.142 EST [1]=[292194-1]user=,db=,app=,client=,session-start=2021-01-21 02:03:07 EST,session-id=6009272b.1LOG: 00000: server process (PID 331611) was terminated by signal 9: Killed
2021-01-23 20:28:08.142 EST [1]=[292195-1]user=,db=,app=,client=,session-start=2021-01-21 02:03:07 EST,session-id=6009272b.1LOCATION: LogChildExit, postmaster.c:3686
2021-01-23 20:28:08.142 EST [1]=[292196-1]user=,db=,app=,client=,session-start=2021-01-21 02:03:07 EST,session-id=6009272b.1LOG: 00000: terminating any other active server processes
This is a very urgent issue, as we are testing a new PostgreSQL solution.
Hello @jatindersandhu Have you figured out anything since reporting this? We're seeing this too after moving to a new Openshift cluster and I'm quite clueless.
No, I basically shifted away from the operator solution. It is something between the Crunchy operator and Red Hat.
@jatindersandhu Thanks for posting this issue/bug. Can you help me with how to debug and/or simulate the error? I am a bit familiar with OpenShift 4 :) Thank you for your help.
It's difficult to say what's causing this.
We're seeing this on ROSA, i.e. Openshift 4.8 on AWS.
With postgresql-12-centos7 and postgresql-13-centos7.
[1] LOG: server process (PID 24643) was terminated by signal 9: Killed
[1] LOG: terminating any other active server processes
[24622] WARNING: terminating connection because of crash of another server process
[24622] DETAIL: The postmaster has commanded this server process to roll back the current transaction and exit, because another server process exited abnormally and possibly corrupted shared memory.
[24622] HINT: In a moment you should be able to reconnect to the database and repeat your command.
[1] LOG: all server processes terminated; reinitializing
[24645] LOG: database system was interrupted; last known up at 2021-11-26 16:51:16 UTC
[24645] LOG: database system was not properly shut down; automatic recovery in progress
[24645] LOG: invalid record length at 0/1DEE328: wanted 24, got 0
[24645] LOG: redo is not required
[1] LOG: database system is ready to accept connections
It doesn't matter whether we started with an empty database or imported pg_dumpall-ed data with psql.
Giving the pod a lot of resources (gigabytes of memory, multiple CPUs) seems to help initially, but the killer strikes back after a while.
One thing that helps with our staging database, where there isn't much data, is to run watch pg_isready in a terminal of the pod.
I know it's ridiculous and can't possibly be related, but that really scares the s*** out of the killer and it disappears.
In our production database, where there's more data, this didn't help when I tried it, however.
Then, when I created a new pod with postgres-13 (the previous one was postgres-12, so I wanted to try v13 as well) and imported the production data, it looked like it was working without any problems. But then, when I tried postgres-13 on staging as well, the problem showed up again.
So as you can see, this is quite indeterminate and there's no reliable reproducer AFAIK.
Sorry for the confusion between this OpenShift setup and the Crunchy operator. It seems it is a resource issue. A kill -9 can only be issued by Kubernetes, since Postgres does not use kill -9 to kill its own processes (that would restart the whole database). I think we need to check whether a liveness or readiness probe is configured. Sometimes OpenShift has a bug where the probe times out and Kubernetes kills Postgres. The solution for this is to increase the probe timeout, e.g. to 5 minutes.
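A sketch of raising both probe timeouts, assuming the database runs from a DeploymentConfig named postgresql with both probes already defined on its first container (adjust the name and values to your setup):
# raise readiness and liveness probe timeouts to 5 minutes
oc patch dc/postgresql --type=json -p '[
  {"op": "replace", "path": "/spec/template/spec/containers/0/readinessProbe/timeoutSeconds", "value": 300},
  {"op": "replace", "path": "/spec/template/spec/containers/0/livenessProbe/timeoutSeconds", "value": 300}
]'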
Experimenting with the probes was the first thing I did and from what I remember I was seeing the problem even without any probe, so the probe shouldn't be a problem per se.
I wonder if we should increase logging and analyze those logs; it could reveal how much memory Postgres tries to acquire and why.
Pity this one isn't merged yet: https://github.com/sclorg/postgresql-container/pull/366/
I wonder if we should increase logging
The reporter of this issue did that; see the first comment.
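For reference, a minimal way to crank up verbosity on a running instance uses only standard PostgreSQL settings, nothing specific to this image (superuser access as postgres is assumed):
psql -U postgres -c "ALTER SYSTEM SET log_min_messages = 'debug1'"
psql -U postgres -c "ALTER SYSTEM SET log_error_verbosity = 'verbose'"
psql -U postgres -c "SELECT pg_reload_conf()"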
FTR, the service file gets this:
# Disable OOM kill on the postmaster
OOMScoreAdjust=-1000
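In the container case it's easy to check what adjustment the postmaster actually ends up with; in this image the postmaster is PID 1 (see the [1] entries in the log above), and <pod-name> is a placeholder:
oc rsh <pod-name> cat /proc/1/oom_score_adj
oc rsh <pod-name> cat /proc/1/oom_score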
Is the kernel OOM killer enabled in the OpenShift environment?
Good point Pavel, thanks.
Managing Kernel Resources chapter says:
Another approach is to set the process-specific OOM score adjustment value for the postmaster process to -1000,
thereby guaranteeing it will not be targeted by the OOM killer.
The simplest way to do this is to execute
echo -1000 > /proc/self/oom_score_adj
in the postmaster's startup script just before invoking the postmaster.
Note that this action must be done as root, or it will have no effect; so a root-owned startup script is the easiest place to do it.
If you do this, you should also set these environment variables in the startup script before invoking the postmaster:
export PG_OOM_ADJUST_FILE=/proc/self/oom_score_adj
export PG_OOM_ADJUST_VALUE=0
These settings will cause postmaster child processes to run with the normal OOM score adjustment of zero,
so that the OOM killer can still target them at need.
and yes, the postgresql.service follows that with
# Disable OOM kill on the postmaster
OOMScoreAdjust=-1000
# ... but allow it still to be effective for child processes
# (note that these settings are ignored by Postgres releases before 9.5)
Environment=PG_OOM_ADJUST_FILE=/proc/self/oom_score_adj
Environment=PG_OOM_ADJUST_VALUE=0
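A rough container-entrypoint equivalent of those systemd settings would be something like the sketch below (not what the sclorg image actually ships; writing -1000 needs root, and $PGDATA is assumed to point at the data directory):
echo -1000 > /proc/self/oom_score_adj    # fails with EPERM when not running as root
export PG_OOM_ADJUST_FILE=/proc/self/oom_score_adj
export PG_OOM_ADJUST_VALUE=0
exec postgres -D "$PGDATA"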
In K8s/Openshift the OOM score adjustment can't be turned off completely (i.e. set to -1000), but can be set to a minimum if the pod has Guaranteed QoS, i.e. when memory limit and memory request sizes are set the same.
The PG_OOM_ADJUST_FILE=/proc/self/oom_score_adj is also very important; otherwise the child processes will run with the same OOM score adjustment as the postmaster, which is unwise since the whole point is to ensure that the postmaster has a preferential setting.
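A sketch of applying both suggestions to a DeploymentConfig named postgresql (the name and the sizes are assumptions, adapt them to your deployment):
# Guaranteed QoS: requests must equal limits for both cpu and memory
oc set resources dc/postgresql --requests=cpu=1,memory=2Gi --limits=cpu=1,memory=2Gi
# keep the postmaster's inherited adjustment, let the backends reset theirs to 0
oc set env dc/postgresql PG_OOM_ADJUST_FILE=/proc/self/oom_score_adj PG_OOM_ADJUST_VALUE=0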
I tried that, gave the pod 2Gi of memory as both request and limit, even set PG_OOM_ADJUST_VALUE=900 (the default is 0), and still, sooner or later, I can see
[1] LOG: server process (PID 114118) was terminated by signal 9: Killed
in the log.
Which, as I understand it, means that it wasn't the main process (the postmaster) that was killed, but some other/child process; still, all connections were terminated.
Also, when I gave the pod a really small amount of memory, like (requests&limits) 60Mi, because I explicitly wanted to see the OOMkiller in action, it was running just fine for quite some time before the killer showed up.
I more and more tend to think that it's the memory overcommit, which Kubernetes actively sets to vm.overcommit_memory=1 (always overcommit), while the PostgreSQL community recommends setting it to vm.overcommit_memory=2 (don't overcommit).
Also this article summarizes it very well.
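The overcommit policy is a node-level sysctl, so a plain pod can only read it, not change it; checking it from inside the pod is straightforward:
cat /proc/sys/vm/overcommit_memory    # 0 = heuristic, 1 = always overcommit, 2 = don't overcommit
cat /proc/sys/vm/overcommit_ratio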
The result is that I still don't have any clue how to make it work reliably :(
Experimenting with the probes was the first thing I did and from what I remember I was seeing the problem even without any probe, so the probe shouldn't be a problem per se.
Looks like it really was a readiness probe (related) issue after all. After I tried removing the probe once more, the problem disappeared for good. I have no idea why it hadn't worked the first time I tried it. Maybe it's the combination of all the changes I made, which is: