matrix-docker-ansible-deploy
Updating error: Postgres fails
Describe the bug: I updated a Debian machine running bullseye (11.6) and installed the kernel update. After rebooting, I executed
ansible-playbook -i inventory/hosts setup.yml --tags=setup-all
and the playbook fails at this point:
TASK [galaxy/com.devture.ansible.role.postgres : Execute Postgres managed database initialization SQL file for synapse] *****************************************************************************************
fatal: [matrix.MYDOMAIN.TLD]: FAILED! => changed=true
cmd:
- /usr/bin/env
- docker
- run
- --rm
- --user=998:1000
- --cap-drop=ALL
- --env-file=/matrix/postgres/env-postgres-psql
- --network=matrix
- --mount
- type=bind,src=/tmp/matrix-postgres-init-managed-db-user-and-role.sql,dst=/matrix-postgres-init-managed-db-user-and-role.sql,ro
- --entrypoint=/bin/sh
- docker.io/postgres:14.6-alpine
- -c
- psql -h matrix-postgres --file=/matrix-postgres-init-managed-db-user-and-role.sql
delta: '0:00:00.571586'
end: '2023-02-14 20:26:51.209467'
msg: non-zero return code
rc: 127
start: '2023-02-14 20:26:50.637881'
stderr: 'docker: Error response from daemon: failed to create shim task: OCI runtime create failed: runc create failed: unable to start container process: error during container init: unable to apply apparmor profile: apparmor failed to apply profile: write /proc/self/attr/apparmor/exec: no such file or directory: unknown.'
stderr_lines: <omitted>
stdout: ''
stdout_lines: <omitted>
The Linux kernel is
> cat /proc/version
Linux version 5.10.0-21-amd64 ([email protected]) (gcc-10 (Debian 10.2.1-6) 10.2.1 20210110, GNU ld (GNU Binutils for Debian) 2.35.2) #1 SMP Debian 5.10.162-1 (2023-01-21)
I've heard of similar problems recently. Seems like it might be a Debian bug with apparmor being installed, but not having any policies by default. You may wish to look into reinstalling apparmor or uninstalling it completely.
This fixed it for me (on Ubuntu, but should be similar)
apt install apparmor apparmor-profiles
service apparmor restart
service docker restart
and retry the deployment
Thank you @throwawayay, it fixed the issue on an Ubuntu Server 20.04.5 machine.
On a fresh install on a freshly set up Ubuntu 22.04, I am seeing a similar but different error right now:
TASK [galaxy/com.devture.ansible.role.postgres : Create managed database initialization SQL file for synapse] ****************************************************************************
ok: [m-1.acter.global]
TASK [galaxy/com.devture.ansible.role.postgres : Execute Postgres managed database initialization SQL file for synapse] ******************************************************************
fatal: [m-1.acter.global]: FAILED! => changed=true
cmd:
- /usr/bin/env
- docker
- run
- --rm
- --user=998:1000
- --cap-drop=ALL
- --env-file=/matrix/postgres/env-postgres-psql
- --network=matrix
- --mount
- type=bind,src=/tmp/matrix-postgres-init-managed-db-user-and-role.sql,dst=/matrix-postgres-init-managed-db-user-and-role.sql,ro
- --entrypoint=/bin/sh
- docker.io/postgres:15.3-alpine
- -c
- psql -h matrix-postgres --file=/matrix-postgres-init-managed-db-user-and-role.sql
delta: '0:00:03.399431'
end: '2023-06-01 10:54:53.973809'
msg: non-zero return code
rc: 2
start: '2023-06-01 10:54:50.574378'
stderr: |-
psql: error: connection to server at "matrix-postgres" (172.18.0.2), port 5432 failed: Host is unreachable
Is the server running on that host and accepting TCP/IP connections?
stderr_lines: <omitted>
stdout: ''
stdout_lines: <omitted>
PLAY RECAP *******************************************************************************************************************************************************************************
I ran:
ansible-playbook -i inventory/hosts -l m-1.acter.global setup.yml --tags=install-all,ensure-matrix-users-created,start
my machine:
root@ubuntu:~# cat /proc/version
Linux version 5.15.0-73-generic (buildd@bos03-amd64-060) (gcc (Ubuntu 11.3.0-1ubuntu1~22.04.1) 11.3.0, GNU ld (GNU Binutils for Ubuntu) 2.38) #80-Ubuntu SMP Mon May 15 15:18:26 UTC 2023
I already tried installing apparmor and the profiles as mentioned before, and also tried removing all of that...
Tried a totally fresh Ubuntu 22.04 again, just installed apparmor, still no luck. My solution was to switch to Debian 11. Here, after uninstalling apparmor, it went through just fine.
I can confirm that this bug still exists. Trying Debian 12 now. This is my output from Ubuntu Server 22.04.3:
TASK [galaxy/postgres : Execute Postgres managed database initialization SQL file for synapse] ***************************************************************************
fatal: [matrix.domain.com]: FAILED! => changed=true
cmd:
- /usr/bin/env
- docker
- run
- --rm
- --user=998:1000
- --cap-drop=ALL
- --env-file=/matrix/postgres/env-postgres-psql
- --network=matrix-postgres
- --mount
- type=bind,src=/tmp/matrix-postgres-init-managed-db-user-and-role.sql,dst=/matrix-postgres-init-managed-db-user-and-role.sql,ro
- --entrypoint=/bin/sh
- docker.io/postgres:16.1-alpine
- -c
- psql -h matrix-postgres --file=/matrix-postgres-init-managed-db-user-and-role.sql
delta: '0:00:24.422965'
end: '2024-02-02 15:56:11.316515'
msg: non-zero return code
rc: 2
start: '2024-02-02 15:55:46.893550'
stderr: |-
psql: error: connection to server at "matrix-postgres" (172.19.0.2), port 5432 failed: Connection refused
Is the server running on that host and accepting TCP/IP connections?
stderr_lines: <omitted>
stdout: ''
stdout_lines: <omitted>
I have just performed a Postgres upgrade against one of my servers (Rocky Linux v9, though) and it worked, so this is probably not some generic issue affecting everybody.
To everyone experiencing this problem, it may be because your new Postgres (starting with a brand new empty data directory) is slow to start.
The playbook starts it, waits devture_postgres_managed_databases_postgres_start_wait_timeout_seconds seconds (45 on ARM, 15 on amd64), and then tries to prepare various databases and credentials, the first of which is for synapse ("Execute Postgres managed database initialization SQL file for synapse").
Because Postgres is not ready yet, it fails.
You can add something like devture_postgres_managed_databases_postgres_start_wait_timeout_seconds: 45 to your vars.yml file or to your upgrade-postgres command (e.g. ... -e devture_postgres_managed_databases_postgres_start_wait_timeout_seconds=45) and see if that helps.
If it's still failing, before reverting to the old version, you can check the matrix-postgres status (systemctl status matrix-postgres) and logs (journalctl -fu matrix-postgres -n 300).
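To see by hand whether Postgres simply needs longer than the configured wait, a small retry loop can help. This is only a rough sketch and not what the playbook itself does: wait_until is a hypothetical helper name, and it just reruns a given command once per second until it succeeds or the number of attempts runs out.

```shell
#!/bin/sh
# wait_until ATTEMPTS CMD [ARGS...]
# Hypothetical helper (not part of the playbook): runs CMD once per second
# until it exits with status 0, giving up after ATTEMPTS tries.
# Returns 0 if CMD succeeded in time, 1 otherwise.
wait_until() {
    attempts=$1
    shift
    tried=0
    while [ "$tried" -lt "$attempts" ]; do
        # Discard the command's output; only its exit status matters here.
        if "$@" >/dev/null 2>&1; then
            return 0
        fi
        sleep 1
        tried=$((tried + 1))
    done
    return 1
}
```

For example, assuming the container is named matrix-postgres (as the hostname in the logs above suggests) and that pg_isready is available inside the Postgres image, something like `wait_until 45 docker exec matrix-postgres pg_isready && echo ready` would show roughly how long the server takes to accept connections. If it regularly needs more than the default wait, raising devture_postgres_managed_databases_postgres_start_wait_timeout_seconds is probably worth a try.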