postgresql-container

Upgrade flow for images RHEL/CentOS 8+

Open eifrach opened this issue 10 months ago • 17 comments

Creating a flow for upgrading PostgreSQL:

  1. Add install_weak_deps=false to the installation, which reduces the image size.
  2. Install the postgresql-upgrade package for the upgrade.
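
As a rough sketch of the two steps above (not the PR's actual Dockerfile change), the install command could be assembled as follows. The helper name build_install_cmd and the postgresql-server package in the example call are assumptions made for illustration; --setopt=install_weak_deps=False is the usual dnf way to set that option for a single transaction.

```shell
#!/bin/bash
# Hypothetical helper: prints a dnf command that installs the given
# packages without pulling in weak dependencies (Recommends/Supplements).
build_install_cmd() {
    if [ "$#" -eq 0 ]; then
        echo "usage: build_install_cmd PACKAGE..." >&2
        return 1
    fi
    # install_weak_deps=False skips optional packages, which keeps the image smaller.
    echo "dnf install -y --setopt=install_weak_deps=False $*"
}

# Example: a server package (assumed name) plus the upgrade tooling from this PR.
build_install_cmd postgresql-server postgresql-upgrade
```

The function only prints the command, so it can be inspected or tested without a package manager; in an image build the printed line would be run directly.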

eifrach avatar Apr 11 '24 12:04 eifrach

[test]

eifrach avatar Apr 11 '24 12:04 eifrach

Overall, it looks good. Thank you for your contribution. We will need to generalize it and implement it in the master branch. https://github.com/sclorg/postgresql-container/blob/master/src/root/usr/share/container-scripts/postgresql/common.sh

fila43 avatar Apr 11 '24 16:04 fila43

Overall, it looks good. Thank you for your contribution. We will need to generalize it and implement it in the master branch. https://github.com/sclorg/postgresql-container/blob/master/src/root/usr/share/container-scripts/postgresql/common.sh

I've issued fixes for all the remarks. Once they are approved, I'll create a PR with changes to all common.sh files.

eifrach avatar Apr 14 '24 13:04 eifrach

It makes a good impression on me. Please implement the changes in the /src directory. Once they are there, we will check and adjust the compatibility for rhel7/centos7 vs rhel8+ versions, or we can wait for the EOL of the centos7/rhel7 images. Also, there should already be a prepared upgrade test that needs to be enabled for rhel8+ containers.

fila43 avatar Apr 16 '24 20:04 fila43

[test]

eifrach avatar Apr 18 '24 12:04 eifrach

cc @danmanor

eifrach avatar Apr 18 '24 12:04 eifrach

LGTM, but there is a dist-gen issue.

fila43 avatar Apr 24 '24 19:04 fila43

@phracek would it be possible to also enable this test for RHEL8+? https://github.com/sclorg/postgresql-container/blob/d0cecca7766a6150489228dc2670143bf73a3997/test/run_test#L682

fila43 avatar Apr 24 '24 19:04 fila43

[test]

fila43 avatar Apr 24 '24 19:04 fila43

Hi,

Thanks for the PR! I've got a few general comments about PostgreSQL major version upgrades, and although they may not necessarily be related to the PR, I reckon it's a good idea to collect them here. As an outsider I could be missing some details of this project; please let me know if that's the case.

  • POSTGRESQL_UPGRADE seems to allow only --link and --copy, but since PG 12 there is also the --clone option, which somewhat combines the best of both worlds but requires newer Linux kernel versions and a file system with reflink support.

  • The docs describe the --link option as unsafe, but the old cluster becomes unusable only if the new cluster was started. If it wasn't, the old cluster is unmodified and should be safe to start.

  • It seems that it's possible to provide custom initdb options via POSTGRESQL_UPGRADE_INITDB_OPTIONS, but some of them must match the old cluster (collation, encoding, data checksums, WAL segment size). It sounds a bit unreliable to leave this to the container user; could those options be enforced?

  • It's worth running pg_upgrade --check to verify compatibility.

  • It may be a good idea to add -j to upgrade_cmd for additional parallelism.

  • Triggering a vacuum after the upgrade to rebuild statistics is usually a good idea as well.

  • A more open-ended question: what do you think about situations where some coordination is needed? E.g. when upgrading a replica, rsync most likely has to be used in coordination with the primary upgrade. Or upgrading extensions, which pg_upgrade does not handle either.

    It's probably beyond what you want to achieve within a container, but it's at least worth mentioning in the documentation, right?
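
To illustrate the point about enforcing initdb options, here is a minimal sketch that derives two of them (data checksums and WAL segment size) from the old cluster instead of trusting user input. It assumes pg_controldata output in the C/English locale; the function name initdb_opts_from_controldata is hypothetical and not part of the container scripts. Encoding and collation are not reported by pg_controldata and would have to be queried from the old cluster separately.

```shell
#!/bin/bash
# Hypothetical helper: reads `pg_controldata` output on stdin and prints
# the initdb options that must match the old cluster.
initdb_opts_from_controldata() {
    local line checksum_version="" wal_bytes="" opts=""
    while IFS= read -r line; do
        case "$line" in
            "Data page checksum version:"*)
                checksum_version="${line##*:}"
                checksum_version="${checksum_version//[[:space:]]/}"
                ;;
            "Bytes per WAL segment:"*)
                wal_bytes="${line##*:}"
                wal_bytes="${wal_bytes//[[:space:]]/}"
                ;;
        esac
    done
    # A non-zero checksum version means data checksums are enabled.
    if [ -n "$checksum_version" ] && [ "$checksum_version" != "0" ]; then
        opts="--data-checksums"
    fi
    # initdb expects the WAL segment size in megabytes.
    if [ -n "$wal_bytes" ]; then
        opts="${opts:+$opts }--wal-segsize=$((wal_bytes / 1024 / 1024))"
    fi
    echo "$opts"
}

# Example with canned input; against a real cluster the input would come
# from running pg_controldata on the old data directory.
printf '%s\n' \
    'Data page checksum version:           1' \
    'Bytes per WAL segment:                16777216' \
    | initdb_opts_from_controldata
```

The printed options could then be merged with, or checked against, whatever the user passed in POSTGRESQL_UPGRADE_INITDB_OPTIONS. Note that initdb only accepts --wal-segsize from PostgreSQL 11 onwards.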

https://www.postgresql.org/docs/current/pgupgrade.html
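
The suggestions about the transfer mode, --check, and -j could be combined roughly as below. This is a sketch under assumptions, not the logic in common.sh: the helper name build_pg_upgrade_args and the mode names copy/link/clone are made up for illustration. The flags themselves (--link, --clone, --jobs, --check) are documented pg_upgrade options.

```shell
#!/bin/bash
# Hypothetical helper: prints pg_upgrade arguments for a transfer mode.
# Arguments: MODE (copy|link|clone), JOBS (positive integer), CHECK_ONLY (0|1)
build_pg_upgrade_args() {
    local mode="$1" jobs="${2:-1}" check_only="${3:-0}" args
    case "$mode" in
        copy)  args="" ;;         # copying is the default, so no flag is emitted
        link)  args="--link" ;;
        clone) args="--clone" ;;  # needs PostgreSQL 12+ and reflink support
        *) echo "unsupported mode: $mode" >&2; return 1 ;;
    esac
    # Reject empty, non-numeric, or zero job counts.
    case "$jobs" in
        ''|*[!0-9]*|0) echo "invalid job count: $jobs" >&2; return 1 ;;
    esac
    args="${args:+$args }--jobs=$jobs"
    # --check only verifies compatibility and leaves both clusters untouched.
    if [ "$check_only" = "1" ]; then
        args="$args --check"
    fi
    echo "$args"
}

# Example: a dry-run compatibility check in clone mode with 4 parallel jobs.
build_pg_upgrade_args clone 4 1
```

A caller would typically run pg_upgrade once with the check-only arguments and, if that succeeds, again without --check. Rebuilding statistics afterwards (for example with vacuumdb --all --analyze-in-stages) is a separate step.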

erthalion avatar Apr 26 '24 12:04 erthalion

[test]

fila43 avatar Apr 30 '24 13:04 fila43

[test]

fila43 avatar Apr 30 '24 14:04 fila43

[test]

fila43 avatar May 06 '24 09:05 fila43

[test]

fila43 avatar May 06 '24 13:05 fila43

@phracek @zmiklank any idea about c9s failures? RHEL9 passed but c9s failed

fila43 avatar May 06 '24 17:05 fila43

@phracek @zmiklank any idea about c9s failures? RHEL9 passed but c9s failed

This is also reproducible in the master branch, so it does not seem to be related to this PR. It looks like a segfault (double free) in the pg_isready binary.

Aren't there any differences in c9s and rhel9 binaries?

zmiklank avatar May 07 '24 07:05 zmiklank

So this may be a PostgreSQL issue [1] that appears only with the newer OpenSSL (3.2.1), which has reached c9s but not RHEL 9 yet. An upstream PostgreSQL patch seems to exist already; more info is in the linked mailing list thread [1].

[1] https://www.postgresql.org/message-id/CAN55FZ1eDDYsYaL7mv%2BoSLUij2h_u6hvD4Qmv-7PK7jkji0uyQ%40mail.gmail.com

zmiklank avatar May 07 '24 09:05 zmiklank

Sorry for the delay, I was on PTO. Looking into the failures.

eifrach avatar May 28 '24 08:05 eifrach

Sorry for the delay, I was on PTO. Looking into the failures.

No problem at all. Just note that c8s images will be deprecated on May 31st, so there is no need to work on them further, I guess. (This PR needs to be merged first: https://github.com/sclorg/postgresql-container/pull/567)

zmiklank avatar May 28 '24 08:05 zmiklank

[test]

fila43 avatar May 29 '24 21:05 fila43

I tested the failed RHEL8 cases locally and they work; the c9s failures are known issues, so I would merge it. @eifrach could you please confirm my observation before we merge?

fila43 avatar Jun 18 '24 09:06 fila43

I tested the failed RHEL8 cases locally and they work; the c9s failures are known issues, so I would merge it. @eifrach could you please confirm my observation before we merge?

Yes, it works for me. I've also tested the data migration.

eifrach avatar Jun 18 '24 11:06 eifrach