matrix-docker-ansible-deploy
Nginx cannot start after server reboot
Playbook Configuration:
My `vars.yml` file looks like this:
---
# The bare domain name which represents your Matrix identity.
# Matrix user ids for your server will be of the form (`@user:<matrix-domain>`).
#
# Note: this playbook does not touch the server referenced here.
# Installation happens on another server ("matrix.<matrix-domain>").
#
# If you've deployed using the wrong domain, you'll have to run the Uninstalling step,
# because you can't change the Domain after deployment.
#
# Example value: example.com
matrix_domain: atunemic.cn
# The Matrix homeserver software to install.
# See `roles/matrix-base/defaults/main.yml` for valid options.
matrix_homeserver_implementation: dendrite
# A secret used as a base, for generating various other secrets.
# You can put any string here, but generating a strong one is preferred (e.g. `pwgen -s 64 1`).
matrix_homeserver_generic_secret_key: '***'
# This is something which is provided to Let's Encrypt when retrieving SSL certificates for domains.
#
# In case SSL renewal fails at some point, you'll also get an email notification there.
#
# If you decide to use another method for managing SSL certificates (different than the default Let's Encrypt),
# you won't be required to define this variable (see `docs/configuring-playbook-ssl-certificates.md`).
#
# Example value: [email protected]
matrix_ssl_lets_encrypt_support_email: '[email protected]'
# A Postgres password to use for the superuser Postgres user (called `matrix` by default).
#
# The playbook creates additional Postgres users and databases (one for each enabled service)
# using this superuser account.
matrix_postgres_connection_password: '***'
matrix_synapse_enable_registration: true
matrix_synapse_registration_requires_token: true
matrix_synapse_registrations_require_3pid: 'email'
matrix_prometheus_enabled: false
matrix_prometheus_node_exporter_enabled: false
matrix_grafana_enabled: false
matrix_grafana_anonymous_access: false
# This has no relation to your Matrix user id. It can be any username you'd like.
# Changing the username subsequently won't work.
matrix_grafana_default_admin_user: "kevin"
# Changing the password subsequently won't work.
matrix_grafana_default_admin_password: "***"
matrix_synapse_admin_enabled: flase
matrix_synapse_ext_password_provider_shared_secret_auth_enabled: false
matrix_synapse_ext_password_provider_shared_secret_auth_shared_secret: ***
matrix_bot_mjolnir_enabled: true
matrix_bot_mjolnir_access_token: "***"
matrix_bot_mjolnir_management_room: "!AVjqHyfcl6BsDRTO:atunemic.cn"
matrix_synapse_ext_spam_checker_mjolnir_antispam_enabled: true
matrix_synapse_ext_spam_checker_mjolnir_antispam_config_block_invites: false
matrix_synapse_ext_spam_checker_mjolnir_antispam_config_block_messages: false
matrix_synapse_ext_spam_checker_mjolnir_antispam_config_block_usernames: false
matrix_synapse_ext_spam_checker_mjolnir_antispam_config_ban_lists: []
matrix_mautrix_telegram_enabled: false
matrix_mautrix_telegram_api_id: 9609852
matrix_mautrix_telegram_api_hash: ***
matrix_mautrix_telegram_bot_token: ***
matrix_mautrix_telegram_configuration_extension_yaml: |
  bridge:
    permissions:
      '*': relaybot
      '@kevin_liu:atunemic.cn': admin
matrix_dimension_enabled: true
matrix_dimension_access_token: "***"
matrix_dimension_admins:
- "@kevin_liu:{{ matrix_domain }}"
matrix_s3_media_store_enabled: false
matrix_s3_media_store_bucket_name: "matrix-1302020253"
matrix_s3_media_store_aws_access_key: "***"
matrix_s3_media_store_aws_secret_key: "***"
matrix_s3_media_store_custom_endpoint_enabled: true
# Example: "https://storage.googleapis.com"
matrix_s3_media_store_custom_endpoint: "***"
matrix_bot_matrix_registration_bot_enabled: true
# Token obtained via logging into the bot account (see above)
matrix_bot_matrix_registration_bot_bot_access_token: "***"
# Enables registration
matrix_synapse_enable_registration: true
# Restrict registration to users with a token
matrix_synapse_registration_requires_token: true
matrix_ma1sd_enabled: true
matrix_synapse_log_level: "INFO"
matrix_synapse_storage_sql_log_level: "INFO"
matrix_synapse_root_log_level: "INFO"
Matrix Server:
- OS: archlinux
- Architecture: amd64
Problem description:
Before I rebooted my server, the web UI would not open. I rebooted because I thought the load was too heavy for the server to handle, but after the reboot the web UI still does not open.
Additional context: `journalctl -fu matrix-nginx-proxy.service`
Aug 30 16:40:57 archlinux systemd[1]: matrix-nginx-proxy.service: Scheduled restart job, restart counter is at 71.
Aug 30 16:40:57 archlinux systemd[1]: Stopped Matrix nginx-proxy server.
Aug 30 16:40:57 archlinux systemd[1]: Starting Matrix nginx-proxy server...
Aug 30 16:40:57 archlinux systemd[1]: Started Matrix nginx-proxy server.
Aug 30 16:40:57 archlinux matrix-nginx-proxy[22486]: docker: Error response from daemon: driver failed programming external connectivity on endpoint matrix-nginx-proxy (87aad82eee715c36c5c704e9b17f295b0e7f3fff8ef4ebceb9705197d88cb30d): Bind for 0.0.0.0:8448 failed: port is already allocated.
Aug 30 16:40:57 archlinux systemd[1]: matrix-nginx-proxy.service: Main process exited, code=exited, status=125/n/a
Aug 30 16:40:57 archlinux systemd[1]: matrix-nginx-proxy.service: Failed with result 'exit-code'.
The log repeats in this cycle.
See what else could be occupying port 8448 and preventing `matrix-nginx-proxy.service` from starting. `netstat -anp | grep :8448` may help.
Perhaps you had a manually installed Synapse in the past?
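A minimal sketch of the check suggested above: the helper filters `netstat -anp` output down to the processes listening on one port. The function name `listeners_on_port` is made up for illustration, and the column positions assume the usual Linux net-tools layout (local address in column 4, state in column 6, PID/program in column 7).

```shell
#!/bin/sh
# Read `netstat -anp` style output on stdin and print the PID/program
# column of every socket in LISTEN state on the given port.
# Assumed columns: 4 = local address, 6 = state, 7 = PID/program.
listeners_on_port() {
    port="$1"
    awk -v port="$port" '
        $6 == "LISTEN" && $4 ~ (":" port "$") { print $7 }
    '
}

# Usage on the server (run as root so the PID column is filled in):
#   netstat -anp | listeners_on_port 8448
```

Anchoring the match at the end of the local address keeps a port such as 18448 from being reported by mistake, which a plain `grep :8448` can also avoid only by luck of the preceding colon.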
I installed Dendrite manually in the past (from the AUR). Here is the output of netstat:
tcp 0 0 0.0.0.0:8448 0.0.0.0:* LISTEN 992/docker-proxy
tcp6 0 0 :::8448 :::* LISTEN 997/docker-proxy
I have stopped all the Matrix services with `ansible-playbook -i inventory/hosts setup.yml --tags=stop`.
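Because the listener is `docker-proxy`, the port is normally held by a container's published port rather than by a host process. A short sketch for finding that container with the `publish` filter of `docker ps`; the wrapper name `containers_publishing_port` is hypothetical.

```shell
#!/bin/sh
# Print the name and port mappings of running containers that publish
# the given port.
containers_publishing_port() {
    docker ps --filter "publish=$1" --format '{{.Names}}: {{.Ports}}'
}

# Usage on the server:
#   containers_publishing_port 8448
```

If this prints nothing while `docker-proxy` is still listening, the binding may be left over in Docker's own state, which would fit the later reports in this thread that restarting Docker cleared it. That reading is an inference, not something confirmed here.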
I am encountering what I believe to be the same issue.
@Seele-Vollerei32 – Did you find a solution? I'm also curious: is your server pretty low-powered (low memory / CPU)?
Some more details of my issue:
- I've tried switching between `playbook-managed-traefik` and `playbook-managed-nginx` to debug (and because I would take anything that lets me use my server in the short term). The same error message happens when using `matrix_playbook_reverse_proxy_type: playbook-managed-traefik` and when using `playbook-managed-nginx` – logs say that `Bind for 0.0.0.0:8448 failed: port is already allocated`
- The logs for nginx / traefik show that nginx/traefik is repeatedly attempting to start after each failure
- The process using the 8448 port is `docker-proxy`, matching previous comment
- `docker ps` shows that `matrix-synapse` indeed has a binding to `8448/tcp`
- I've tried doing `just stop-all` -> `just setup-all` multiple times, thinking that maybe `matrix-synapse` needs to be down before Traefik/nginx starts; no success
- I'm using a low-powered server (Oracle Cloud `VM.Standard.E2.1.Micro`: 1GB of memory + 1GB swapfile). (Could Traefik be racing matrix-synapse? like, maybe other servers launch Traefik more quickly, allowing it to bind to 8448, and somehow matrix-synapse binds to the same port after Traefik launches... somehow? 🤷)
- My setup is very vanilla – no custom webserver, nothing else on the machine apart from what's deployed by matrix-docker-ansible-deploy
- Right before I had this issue, I had some failed `setup-all`s, since my machine was running out of memory mid-setup: server became unresponsive and I had to force reboot. This is no longer happening after I added a swapfile. Not sure if this is relevant.
- `systemctl list-units` shows that `matrix-container-socket-proxy.service` is `not found`. Checking `journalctl` for this service shows some logs including `Can't open server state file '/var/lib/haproxy/server-state': No such file or directory`
- My SSL certificates are expired – I started this upgrade process to try fixing certbot failing to autorenew SSL.
- `just setup-all` fails waiting for Traefik / nginx to start (see log below)
- I'm pretty sure this is not the same as https://github.com/spantaleev/matrix-docker-ansible-deploy/issues/1687 – I don't see any logs referencing the .pem files when running `playbook-managed-nginx`
- Tried the following:
  - `docker kill`/`stop`ing matrix-synapse, then waiting for nginx to automatically try starting again. This fails with the same error; it looks like `matrix-synapse` automatically restarts itself and binds to 8448 again. (Is this correct? Doing `docker inspect matrix-synapse`, I see `RestartPolicy` is `no`, which is unexpected.)
  - Set `matrix_synapse_federation_port_enabled`, `matrix_nginx_proxy_proxy_matrix_federation_api_enabled`, `matrix_synapse_reverse_proxy_companion_federation_api_enabled` all to false to try to disable the 8448 port on matrix-synapse (following these docs); ran `setup-all` – same issue
Happy to make a new issue, but this does sound like the same issue.
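On the `RestartPolicy` question: the journal excerpts in this comment show systemd restarting the proxy units (`Scheduled restart job`), so one plausible explanation, not confirmed in this thread, is that `matrix-synapse` is also brought back by a systemd unit rather than by Docker. A sketch for comparing the two, assuming the unit is named after the container (`matrix-synapse.service`); the helper name `restart_sources` is hypothetical.

```shell
#!/bin/sh
# Show who might restart a container: Docker's own restart policy and
# the Restart= setting of a systemd unit assumed to share its name.
restart_sources() {
    name="$1"
    printf 'docker restart policy: '
    docker inspect --format '{{.HostConfig.RestartPolicy.Name}}' "$name"
    printf 'systemd unit setting: '
    systemctl show --property=Restart "$name.service"
}

# Usage on the server:
#   restart_sources matrix-synapse
```

If the unit reports something like `Restart=always`, stopping the container with `docker stop` alone would not keep it down; stopping the unit with `systemctl stop` would.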
vars.yml
---
matrix_domain: earthchat.online
matrix_homeserver_implementation: synapse
matrix_homeserver_generic_secret_key: 'redacted'
matrix_ssl_lets_encrypt_support_email: 'redacted'
devture_postgres_connection_password: 'redacted'
matrix_synapse_admin_enabled: true
matrix_sygnal_enabled: true
matrix_sygnal_apps: 'redacted'
# Disable non-required services
matrix_ma1sd_enabled: false
matrix_mailer_enabled: false
matrix_coturn_enabled: false
matrix_playbook_reverse_proxy_type: playbook-managed-nginx
# also tried with:
# matrix_playbook_reverse_proxy_type: playbook-managed-traefik
# devture_traefik_config_certificatesResolvers_acme_email: 'redacted'
journalctl -fu matrix-nginx-proxy.service
(repeating)
Mar 30 16:53:48 synapse-avenue systemd[1]: Started Matrix nginx-proxy server.
Mar 30 16:53:49 synapse-avenue matrix-nginx-proxy[249953]: time="2023-03-30T16:53:49Z" level=error msg="error waiting for container: context canceled"
Mar 30 16:53:49 synapse-avenue matrix-nginx-proxy[249953]: Error response from daemon: driver failed programming external connectivity on endpoint matrix-nginx-proxy (7139dbe699f9e7d414e3eea5d3413dc401f7c88f27c60f6e4fdb125c0bc7a473): Bind for 0.0.0.0:8448 failed: port is already allocated
Mar 30 16:53:49 synapse-avenue systemd[1]: matrix-nginx-proxy.service: Main process exited, code=exited, status=1/FAILURE
Mar 30 16:53:49 synapse-avenue systemd[1]: matrix-nginx-proxy.service: Failed with result 'exit-code'.
Mar 30 16:54:19 synapse-avenue systemd[1]: matrix-nginx-proxy.service: Scheduled restart job, restart counter is at 118.
Mar 30 16:54:19 synapse-avenue systemd[1]: Stopped Matrix nginx-proxy server.
Mar 30 16:54:19 synapse-avenue systemd[1]: Starting Matrix nginx-proxy server...
Mar 30 16:54:20 synapse-avenue matrix-nginx-proxy[250035]: 3ede0cf7b5554906135a5060c094f86b7fdc5cbdc617bcb4813fa6b3c51ca8e7
Mar 30 16:54:20 synapse-avenue systemd[1]: Started Matrix nginx-proxy server.
journalctl -fu matrix-traefik.service
Very similar to nginx above
Mar 30 18:29:53 synapse-avenue systemd[1]: matrix-traefik.service: Failed with result 'exit-code'.
Mar 30 18:30:23 synapse-avenue systemd[1]: matrix-traefik.service: Scheduled restart job, restart counter is at 1777.
Mar 30 18:30:23 synapse-avenue systemd[1]: Stopped Traefik (matrix-traefik).
Mar 30 18:30:23 synapse-avenue systemd[1]: Starting Traefik (matrix-traefik)...
Mar 30 18:30:24 synapse-avenue matrix-traefik[271921]: 2c521dd7c60ee481943eb4235757f020b4c7840cb316ec9a10374b2d7adb4515
Mar 30 18:30:24 synapse-avenue systemd[1]: Started Traefik (matrix-traefik).
Mar 30 18:30:24 synapse-avenue matrix-traefik[271933]: Error response from daemon: driver failed programming external connectivity on endpoint matrix-traefik (2cabc67bfdf14556207a53f1a5990a81be00c17eaaa9d7ec81768d190ada94c5): Bind for 0.0.0.0:8448 failed: port is already allocated
Mar 30 18:30:24 synapse-avenue systemd[1]: matrix-traefik.service: Main process exited, code=exited, status=1/FAILURE
Mar 30 18:30:24 synapse-avenue systemd[1]: matrix-traefik.service: Failed with result 'exit-code'.
journalctl -fu matrix-container-socket-proxy.service
-- Logs begin at Fri 2023-03-17 08:25:14 UTC. --
Mar 30 15:51:39 synapse-avenue systemd[1]: matrix-container-socket-proxy.service: Main process exited, code=exited, status=137/n/a
Mar 30 15:51:39 synapse-avenue systemd[1]: matrix-container-socket-proxy.service: Failed with result 'exit-code'.
Mar 30 15:51:39 synapse-avenue systemd[1]: Stopped Container Socket Proxy (matrix-container-socket-proxy).
Mar 30 18:24:32 synapse-avenue systemd[1]: Starting Container Socket Proxy (matrix-container-socket-proxy)...
Mar 30 18:24:34 synapse-avenue matrix-container-socket-proxy[269618]: e118ffac4824afe0e1aaeaca1e25c947c163cd78105a5cbac241fffb37c00b13
Mar 30 18:24:34 synapse-avenue systemd[1]: Started Container Socket Proxy (matrix-container-socket-proxy).
Mar 30 18:24:38 synapse-avenue matrix-container-socket-proxy[269625]: [WARNING] 088/182438 (1) : Can't open server state file '/var/lib/haproxy/server-state': No such file or directory
Mar 30 18:24:38 synapse-avenue matrix-container-socket-proxy[269625]: [NOTICE] 088/182438 (1) : New worker #1 (7) forked
Mar 30 18:24:38 synapse-avenue matrix-container-socket-proxy[269625]: Proxy dockerbackend started.
Mar 30 18:24:38 synapse-avenue matrix-container-socket-proxy[269625]: Proxy dockerfrontend started.
Failure of `just setup-all`
TASK [galaxy/com.devture.ansible.role.systemd_service_manager : Fail if service isn't detected to be running] ***
failed: [matrix.earthchat.online] (item=matrix-traefik.service) => changed=false
ansible_loop_var: item
item: matrix-traefik.service
msg: matrix-traefik.service was not detected to be running. It's possible that there's a configuration problem or another service on your server interferes with it (uses the same ports, etc.). Try running `systemctl status matrix-traefik.service` and `journalctl -fu matrix-traefik.service` on the server to investigate. If you're on a slow or overloaded server, it may be that services take a longer time to start and that this error is a false-positive. You can consider raising the value of the `devture_systemd_service_manager_up_verification_delay_seconds` variable. See `/redacted/matrix-docker-ansible-deploy/roles/galaxy/com.devture.ansible.role.systemd_service_manager/defaults/main.yml` for more details about that.
I could not figure out what the issue was, but migrating to a new instance by following https://github.com/spantaleev/matrix-docker-ansible-deploy/blob/master/docs/maintenance-migrating.md got me back up and running :(
Had the same issue; for me the Traefik service would not start because the port was already allocated.
@davidisaaclee's comment pretty much summed up all the symptoms. I was running on a 1 GB instance (t3a.micro EC2) and had failed `setup-all`s as well, which resulted in this broken state. That was a new install, so I didn't need to preserve any configs, and after `docker system prune -a`, removing `/matrix/`, and adding a 2 GB swap file, the install went without a hitch.
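The comment above credits a 2 GB swap file as part of the fix on a 1 GB machine. A minimal sketch of creating and enabling one with standard Linux tools; the path `/swapfile`, the `2G` size, and the function name `add_swapfile` are illustrative assumptions, and the commands need root.

```shell
#!/bin/sh
# Create a swap file of the given size and enable it immediately.
# Each step runs only if the previous one succeeded.
add_swapfile() {
    path="$1"
    size="$2"
    fallocate -l "$size" "$path" &&
    chmod 600 "$path" &&
    mkswap "$path" &&
    swapon "$path"
}

# Usage on the server, as root:
#   add_swapfile /swapfile 2G
```

This does not persist across reboots on its own; an `/etc/fstab` entry for the file is the usual way to make it permanent.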
I had the very same issue, but I had some legacy configs in the following folder on the host: `/matrix/nginx-proxy/conf.d/`. First, I deleted everything in there.
After this, I noticed that some networks in `docker network ls` seemed odd. After running `ansible-playbook -i inventory/hosts setup.yml --tags=stop`, I ran `docker network prune` on the host system.
Now everything works.
Had a similar issue to this today: everything seemed to be working, but the built-in Traefik container kept crashing with this error:
Error response from daemon: driver failed programming external connectivity on endpoint matrix-traefik (---): Bind for 0.0.0.0:8448 failed: port is already allocated
It turned out that docker-proxy was using that port for some reason; restarting Docker as a whole fixed the issue. Just to be safe, though, I also ran `docker system prune --all --volumes` to delete all the containers and networks and start over.
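The two least destructive Docker-side fixes reported in this thread (restarting Docker, then pruning unused networks) can be sketched as one helper. The function name `reset_docker_networking` is hypothetical, and the sketch assumes Docker runs as the systemd unit `docker`. Note that `docker network prune` removes every network not used by at least one container, so it is best run after the playbook's services have been stopped.

```shell
#!/bin/sh
# Restart the Docker daemon, then remove networks that no container uses.
# --force skips the interactive confirmation prompt.
reset_docker_networking() {
    systemctl restart docker &&
    docker network prune --force
}

# Usage on the server, as root, after stopping the Matrix services:
#   reset_docker_networking
```

Restarting the daemon first is a deliberate ordering choice here: if a stale `docker-proxy` is what holds port 8448, the restart alone may be enough, and the prune only cleans up afterwards.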