matrix-docker-ansible-deploy

Updating error: Postgres fails

Open pfrancks opened this issue 2 years ago • 7 comments

Describe the bug: I updated a Debian machine running bullseye (11.6) and installed the kernel update. After rebooting, I executed

ansible-playbook -i inventory/hosts setup.yml --tags=setup-all

and the playbook fails at this point:

TASK [galaxy/com.devture.ansible.role.postgres : Execute Postgres managed database initialization SQL file for synapse] *****************************************************************************************
fatal: [matrix.MYDOMAIN.TLD]: FAILED! => changed=true
  cmd:
  - /usr/bin/env
  - docker
  - run
  - --rm
  - --user=998:1000
  - --cap-drop=ALL
  - --env-file=/matrix/postgres/env-postgres-psql
  - --network=matrix
  - --mount
  - type=bind,src=/tmp/matrix-postgres-init-managed-db-user-and-role.sql,dst=/matrix-postgres-init-managed-db-user-and-role.sql,ro
  - --entrypoint=/bin/sh
  - docker.io/postgres:14.6-alpine
  - -c
  - psql -h matrix-postgres --file=/matrix-postgres-init-managed-db-user-and-role.sql
  delta: '0:00:00.571586'
  end: '2023-02-14 20:26:51.209467'
  msg: non-zero return code
  rc: 127
  start: '2023-02-14 20:26:50.637881'
  stderr: 'docker: Error response from daemon: failed to create shim task: OCI runtime create failed: runc create failed: unable to start container process: error during container init: unable to apply apparmor profile: apparmor failed to apply profile: write /proc/self/attr/apparmor/exec: no such file or directory: unknown.'
  stderr_lines: <omitted>
  stdout: ''
  stdout_lines: <omitted>

The Linux kernel is

> cat /proc/version
Linux version 5.10.0-21-amd64 ([email protected]) (gcc-10 (Debian 10.2.1-6) 10.2.1 20210110, GNU ld (GNU Binutils for Debian) 2.35.2) #1 SMP Debian 5.10.162-1 (2023-01-21)

pfrancks avatar Feb 14 '23 20:02 pfrancks

I've heard of similar problems recently. Seems like it might be a Debian bug with apparmor being installed, but not having any policies by default. You may wish to look into reinstalling apparmor or uninstalling it completely.

spantaleev avatar Feb 14 '23 20:02 spantaleev

This fixed it for me (on Ubuntu, but Debian should be similar):

apt install apparmor apparmor-profiles
service apparmor restart
service docker restart

Then retry the deployment.
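
Before reinstalling anything, it may help to look at what the kernel itself reports. The sketch below is only a way to inspect state, not a diagnosis. It assumes the usual locations /sys/module/apparmor/parameters/enabled (which holds Y or N) and /proc/self/attr/apparmor/exec (the file named in the error above). The function name apparmor_report is made up for illustration.

```shell
#!/bin/sh
# Sketch: print what the kernel reports about AppArmor.
# Both paths are parameters (with assumed defaults) so the function
# can also be pointed at fixture files.
apparmor_report() {
  enabled_file="${1:-/sys/module/apparmor/parameters/enabled}"
  attr_dir="${2:-/proc/self/attr/apparmor}"

  # The module parameter file contains Y or N when it exists.
  if [ -r "$enabled_file" ]; then
    echo "kernel-enabled: $(cat "$enabled_file")"
  else
    echo "kernel-enabled: unknown"
  fi

  # This is the file the failing container start tried to write to.
  if [ -e "$attr_dir/exec" ]; then
    echo "proc-interface: present"
  else
    echo "proc-interface: missing"
  fi
}

apparmor_report
```

If the first line says Y while the second says missing, that would at least match the error message in the original report.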

ghost avatar Feb 15 '23 01:02 ghost

Thank you @throwawayay, it fixed the issue on an Ubuntu Server 20.04.5 machine.

enekonieto avatar Feb 27 '23 16:02 enekonieto

Trying a fresh install on a freshly set up Ubuntu 22.04, I am seeing a similar but different error right now:


TASK [galaxy/com.devture.ansible.role.postgres : Create managed database initialization SQL file for synapse] ****************************************************************************
ok: [m-1.acter.global]

TASK [galaxy/com.devture.ansible.role.postgres : Execute Postgres managed database initialization SQL file for synapse] ******************************************************************
fatal: [m-1.acter.global]: FAILED! => changed=true 
  cmd:
  - /usr/bin/env
  - docker
  - run
  - --rm
  - --user=998:1000
  - --cap-drop=ALL
  - --env-file=/matrix/postgres/env-postgres-psql
  - --network=matrix
  - --mount
  - type=bind,src=/tmp/matrix-postgres-init-managed-db-user-and-role.sql,dst=/matrix-postgres-init-managed-db-user-and-role.sql,ro
  - --entrypoint=/bin/sh
  - docker.io/postgres:15.3-alpine
  - -c
  - psql -h matrix-postgres --file=/matrix-postgres-init-managed-db-user-and-role.sql
  delta: '0:00:03.399431'
  end: '2023-06-01 10:54:53.973809'
  msg: non-zero return code
  rc: 2
  start: '2023-06-01 10:54:50.574378'
  stderr: |-
    psql: error: connection to server at "matrix-postgres" (172.18.0.2), port 5432 failed: Host is unreachable
            Is the server running on that host and accepting TCP/IP connections?
  stderr_lines: <omitted>
  stdout: ''
  stdout_lines: <omitted>

PLAY RECAP *******************************************************************************************************************************************************************************

I ran:

ansible-playbook -i inventory/hosts -l m-1.acter.global setup.yml --tags=install-all,ensure-matrix-users-created,start

my machine:

root@ubuntu:~# cat /proc/version
Linux version 5.15.0-73-generic (buildd@bos03-amd64-060) (gcc (Ubuntu 11.3.0-1ubuntu1~22.04.1) 11.3.0, GNU ld (GNU Binutils for Ubuntu) 2.38) #80-Ubuntu SMP Mon May 15 15:18:26 UTC 2023

I already tried installing apparmor and the profiles as mentioned before, and also tried removing all of that...

gnunicorn avatar Jun 01 '23 11:06 gnunicorn

Tried a totally fresh Ubuntu 22.04 again, just installed apparmor, still no luck. My solution was to switch to Debian 11. Here, after uninstalling apparmor, it went through just fine.

gnunicorn avatar Jun 05 '23 16:06 gnunicorn

Quoting gnunicorn: "Tried a totally fresh Ubuntu 22.04 again, just installed apparmor, still no luck. My solution was to switch to Debian 11. Here, after uninstalling apparmor, it went through just fine."

I can confirm that this bug still exists. Trying Debian 12 now. This is my output from Ubuntu Server 22.04.3:

TASK [galaxy/postgres : Execute Postgres managed database initialization SQL file for synapse] ***************************************************************************
fatal: [matrix.domain.com]: FAILED! => changed=true 
  cmd:
  - /usr/bin/env
  - docker
  - run
  - --rm
  - --user=998:1000
  - --cap-drop=ALL
  - --env-file=/matrix/postgres/env-postgres-psql
  - --network=matrix-postgres
  - --mount
  - type=bind,src=/tmp/matrix-postgres-init-managed-db-user-and-role.sql,dst=/matrix-postgres-init-managed-db-user-and-role.sql,ro
  - --entrypoint=/bin/sh
  - docker.io/postgres:16.1-alpine
  - -c
  - psql -h matrix-postgres --file=/matrix-postgres-init-managed-db-user-and-role.sql
  delta: '0:00:24.422965'
  end: '2024-02-02 15:56:11.316515'
  msg: non-zero return code
  rc: 2
  start: '2024-02-02 15:55:46.893550'
  stderr: |-
    psql: error: connection to server at "matrix-postgres" (172.19.0.2), port 5432 failed: Connection refused
            Is the server running on that host and accepting TCP/IP connections?
  stderr_lines: <omitted>
  stdout: ''
  stdout_lines: <omitted>

Kuphi avatar Feb 02 '24 15:02 Kuphi

I have just performed a Postgres upgrade against one of my servers (Rocky Linux v9, though) and it worked, so this is probably not some generic issue affecting everybody.


To everyone experiencing this problem, it may be because your new Postgres (starting with a brand new empty data directory) is slow to start.

The playbook starts it, waits devture_postgres_managed_databases_postgres_start_wait_timeout_seconds seconds (45 on ARM, 15 on amd64), and then tries to prepare various databases and credentials, the first of which is for synapse ("Execute Postgres managed database initialization SQL file for synapse").

Because Postgres is not ready yet, it fails.

You can add something like devture_postgres_managed_databases_postgres_start_wait_timeout_seconds: 45 to your vars.yml file or to your upgrade-postgres command (e.g. ... -e devture_postgres_managed_databases_postgres_start_wait_timeout_seconds=45) and see if that helps.


If it's still failing, before reverting to the old version, you can check the matrix-postgres status (systemctl status matrix-postgres) and logs (journalctl -fu matrix-postgres -n 300).
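
To get a feel for how long Postgres really takes to come up on a given machine (and so which timeout value to pick), something like the following sketch could be used. It is not part of the playbook. The helper name wait_until_ready is made up, and the usage line assumes the container is called matrix-postgres and that the image ships pg_isready.

```shell
#!/bin/sh
# Sketch: retry a readiness command until it succeeds or the budget runs out.
# Counts one second per failed attempt; time spent inside the command
# itself is not counted. Returns 0 on success, 1 on timeout.
wait_until_ready() {
  timeout_seconds="$1"
  shift
  elapsed=0
  while [ "$elapsed" -lt "$timeout_seconds" ]; do
    # Run the remaining arguments as the readiness check, discarding its output.
    if "$@" >/dev/null 2>&1; then
      echo "ready after ${elapsed}s"
      return 0
    fi
    sleep 1
    elapsed=$((elapsed + 1))
  done
  echo "not ready after ${timeout_seconds}s"
  return 1
}

# Assumed usage (container name and pg_isready availability are assumptions):
# wait_until_ready 60 docker exec matrix-postgres pg_isready
```

If the reported number is above the default wait, raising devture_postgres_managed_databases_postgres_start_wait_timeout_seconds past it would be the obvious thing to try.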

spantaleev avatar Feb 04 '24 06:02 spantaleev