Implement support for core dump creation and back trace extraction in CI

Open • GeoffMontee opened this issue 4 years ago • 1 comment

It might be a nice improvement to our CI if we were able to create core dumps when PostgreSQL crashes and to extract back traces from them automatically. This would probably make it easier to debug crashes like the one we saw in #213.

This idea was originally mentioned in #214 here: https://github.com/tds-fdw/tds_fdw/pull/214#issuecomment-541279407

This would probably require at least the changes listed below.

Changes Required in ci-build

  • We would have to make sure that ci-build compiles tds_fdw with the -ggdb option specified in PG_CPPFLAGS, so that tds_fdw is built with debugging symbols.

Changes Required in ci-setup

  • We would have to make sure that ci-setup installs debuginfo packages for PostgreSQL.

For example:

sudo yum install postgresql12-debuginfo

  • We would have to make sure that ci-setup allows the PostgreSQL process to write core dumps of unlimited size.

For OSes that use systemd, that would probably look like this:

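# The drop-in directory may not exist yet
sudo mkdir -p /etc/systemd/system/postgresql-12.service.d
sudo tee /etc/systemd/system/postgresql-12.service.d/limitcore.conf <<EOF
[Service]

LimitCORE=infinity
EOF
sudo systemctl daemon-reload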

For other OSes, that would probably look like this:

sudo tee /etc/security/limits.d/postgres_core.conf <<EOF
postgres soft core unlimited
postgres hard core unlimited
EOF

  • We would also have to make sure that ci-setup sets up some other parameters related to core dumps.

For example:

sudo tee /etc/sysctl.d/postgres_core.conf <<EOF
# Write core dumps to the /core_dumps directory, using "core" as the base file name
kernel.core_pattern = /core_dumps/core

# Add the PID to the end of the file name
kernel.core_uses_pid = 1

# Allow setuid processes to dump core. Is this necessary for Postgres?
fs.suid_dumpable = 2
EOF
# Apply the new settings without a reboot
sudo sysctl -p /etc/sysctl.d/postgres_core.conf

  • We would also have to make sure that ci-setup creates any paths that we depend on.

For example:

sudo mkdir -p /core_dumps
sudo chmod 0777 /core_dumps
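
To confirm that these settings actually reached the PostgreSQL process, ci-setup or the test script could read the limits of the running postmaster from /proc. This is only a sketch in Python: the function names are hypothetical, and it assumes a Linux /proc file system and that the caller already knows the postmaster PID.

```python
import re
from pathlib import Path


def core_limits(limits_text):
    """Return (soft, hard) from the 'Max core file size' row of /proc/<pid>/limits.

    Returns None if the row is missing.
    """
    for line in limits_text.splitlines():
        match = re.match(r"Max core file size\s+(\S+)\s+(\S+)", line)
        if match:
            return match.group(1), match.group(2)
    return None


def core_dumps_enabled(pid):
    """Hypothetical helper: True if the process may dump core and a pattern is set."""
    limits = core_limits(Path(f"/proc/{pid}/limits").read_text())
    pattern = Path("/proc/sys/kernel/core_pattern").read_text().strip()
    # A soft limit of 0 disables core dumps for the process.
    return limits is not None and limits[0] != "0" and bool(pattern)
```

Calling core_dumps_enabled() with the postmaster PID right after PostgreSQL starts would let the CI fail early if the limits were not applied, rather than finding out only after a crash produced no core file.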

Changes Required in tds_fdw

  • We would have to change tests/postgresql-tests.py in tds_fdw to make it detect PostgreSQL crashes. Maybe it could scan the PostgreSQL log for lines like this:

2019-10-02 02:28:48.702 UTC [50] LOG:  server process (PID 292) was terminated by signal 11: Segmentation fault

  • If tests/postgresql-tests.py detects a PostgreSQL crash, then it would have to get the value of kernel.core_pattern.

For example:

sysctl kernel.core_pattern

  • When tests/postgresql-tests.py has the value of kernel.core_pattern, it could check that path for core dumps.

  • When tests/postgresql-tests.py finds a core dump, it could get all backtraces from it.

For example:

sudo gdb --batch --eval-command="thread apply all bt full" "$(which postmaster)" "${core_file_path}"
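
The steps above could be sketched in Python roughly as follows. This is not the existing code of tests/postgresql-tests.py: all function names are hypothetical, it assumes Python 3.7+, and it assumes that kernel.core_pattern is a plain path with no % specifiers while kernel.core_uses_pid = 1, in which case the kernel appends .PID to the file name.

```python
import re
import subprocess

# Matches the postmaster log line quoted above, e.g.
# "LOG:  server process (PID 292) was terminated by signal 11: Segmentation fault"
CRASH_RE = re.compile(r"server process \(PID (\d+)\) was terminated by signal (\d+)")


def find_crashes(log_text):
    """Return a list of (pid, signal) tuples, one per crashed backend in the log."""
    return [(int(m.group(1)), int(m.group(2))) for m in CRASH_RE.finditer(log_text)]


def read_core_pattern():
    """Read kernel.core_pattern; equivalent to `sysctl -n kernel.core_pattern`."""
    with open("/proc/sys/kernel/core_pattern") as f:
        return f.read().strip()


def core_file_path(core_pattern, pid):
    """Return the expected core file path for a PID, or None if unsupported.

    Assumption: a plain path pattern with kernel.core_uses_pid = 1, so the
    kernel writes <core_pattern>.<pid>. Piped patterns ("|...") and patterns
    with % specifiers are not handled by this sketch.
    """
    if not core_pattern or core_pattern.startswith("|") or "%" in core_pattern:
        return None
    return "{}.{}".format(core_pattern, pid)


def backtrace_command(executable, core_file):
    """Build the gdb command line shown above as an argument list."""
    return [
        "gdb",
        "--batch",
        "--eval-command=thread apply all bt full",
        executable,
        core_file,
    ]


def extract_backtrace(executable, core_file):
    """Run gdb and return its combined stdout and stderr as text."""
    result = subprocess.run(
        backtrace_command(executable, core_file),
        capture_output=True,
        text=True,
        check=False,
    )
    return result.stdout + result.stderr
```

The test script could call find_crashes() on the PostgreSQL log after each test, map every crashed PID to a file with core_file_path(read_core_pattern(), pid), and print the output of extract_backtrace() into the CI log if the file exists. Building the gdb arguments as a list avoids shell quoting problems with the core file path.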

GeoffMontee • Oct 14 '19 20:10

I will start working on this as soon as we drop CentOS 6 from the testing, so that we only have to support systemd (which is already used by Ubuntu 18.04).

I will also need to check how this would work inside the Docker containers we use. Most probably no big deal, but you never know :-)

juliogonzalez • Nov 19 '19 23:11