matrix-docker-ansible-deploy

Own traefik proxy: syslog flooded with sync calls (10 GB per day)

sophonet opened this issue 1 year ago • 11 comments

Describe the bug

I am using the current version of the playbook with my own Traefik frontend, configured according to the guidelines in this repo. However, after some time, my system's syslog starts filling up with "circular" sync calls between matrix-nginx-proxy (which, as I understand it, is still present) and matrix-synapse-reverse-proxy-companion. These messages add up to at least 10 GB per day, so currently I have to delete them manually and reboot my server to keep the hard disk from filling up.

To Reproduce

My vars.yml file looks like this:


matrix_domain: <my domain name, hidden here, has worked before>

matrix_homeserver_implementation: synapse

matrix_homeserver_generic_secret_key: < Secret key, hidden here >
devture_postgres_connection_password: < password, hidden here >
matrix_server_fqn_element: < element fqn >
matrix_server_fqn_matrix: < matrix fqn >
matrix_synapse_admin_enabled: true
matrix_playbook_docker_installation_enabled: false
matrix_ma1sd_enabled: false

matrix_mautrix_signal_enabled: true
matrix_mautrix_whatsapp_enabled: true

matrix_base_data_path: /srv/data/services/matrix/matrix

matrix_ssl_retrieval_method: none

matrix_playbook_reverse_proxy_type: other-traefik-container
matrix_playbook_reverse_proxyable_services_additional_network: 'matrix'
devture_traefik_certs_dumper_ssl_dir_path: /srv/data/services/reverseproxy
matrix_coturn_enabled: false
matrix_docker_network: 'matrix'

Expected behavior

Previously, my Matrix installation ran as expected, without gigabytes of log messages containing these apparent "circular" calls.

Matrix Server:

  • OS: Docker 23.0.3 on Debian Bullseye (running in a Proxmox LXC container)
  • Architecture: amd64

Ansible: (not the issue)

Client: (probably not the issue; it worked flawlessly before the move to Traefik)

Additional context

/var/log/syslog and /var/log/daemon.log look like this (a small snippet; this fills up gigabytes, obviously interspersed with a few other messages):

Apr 28 17:49:40 myhost matrix-nginx-proxy[38635]: 172.24.0.11 - - [28/Apr/2023:17:49:40 +0000] "GET /_matrix/client/v3/sync?timeout=30000&since=s15404_424384_8_10141_15562_3_183_59_0_1&filter=1&set_presence=offline HTTP/1.1" 200 205 "-" "mautrix-signal/0.4.2 mautrix-python/0.18.8 aiohttp/3.8.3 Python/3.10.8" "-"
Apr 28 17:49:40 myhost matrix-synapse-reverse-proxy-companion[37725]: 172.24.0.7 - - [28/Apr/2023:17:49:40 +0000] "GET /_matrix/client/v3/sync?timeout=30000&since=s15404_424384_8_10141_15562_3_183_59_0_1&filter=1&set_presence=offline HTTP/1.0" 200 256 "-" "mautrix-signal/0.4.2 mautrix-python/0.18.8 aiohttp/3.8.3 Python/3.10.8" "-"
Apr 28 17:49:40 myhost matrix-nginx-proxy[38635]: 172.24.0.11 - - [28/Apr/2023:17:49:40 +0000] "GET /_matrix/client/v3/sync?timeout=30000&since=s15404_424384_8_10141_15562_3_183_59_0_1&filter=1&set_presence=offline HTTP/1.1" 200 205 "-" "mautrix-signal/0.4.2 mautrix-python/0.18.8 aiohttp/3.8.3 Python/3.10.8" "-"
Apr 28 17:49:40 myhost matrix-synapse-reverse-proxy-companion[37725]: 172.24.0.7 - - [28/Apr/2023:17:49:40 +0000] "GET /_matrix/client/v3/sync?timeout=30000&since=s15404_424384_8_10141_15562_3_183_59_0_1&filter=1&set_presence=offline HTTP/1.0" 200 256 "-" "mautrix-signal/0.4.2 mautrix-python/0.18.8 aiohttp/3.8.3 Python/3.10.8" "-"
Apr 28 17:49:40 myhost matrix-nginx-proxy[38635]: 172.24.0.11 - - [28/Apr/2023:17:49:40 +0000] "GET /_matrix/client/v3/sync?timeout=30000&since=s15404_424384_8_10141_15562_3_183_59_0_1&filter=1&set_presence=offline HTTP/1.1" 200 205 "-" "mautrix-signal/0.4.2 mautrix-python/0.18.8 aiohttp/3.8.3 Python/3.10.8" "-"
Apr 28 17:49:40 myhost matrix-synapse-reverse-proxy-companion[37725]: 172.24.0.7 - - [28/Apr/2023:17:49:40 +0000] "GET /_matrix/client/v3/sync?timeout=30000&since=s15404_424384_8_10141_15562_3_183_59_0_1&filter=1&set_presence=offline HTTP/1.0" 200 256 "-" "mautrix-signal/0.4.2 mautrix-python/0.18.8 aiohttp/3.8.3 Python/3.10.8" "-"
Apr 28 17:49:40 myhost matrix-nginx-proxy[38635]: 172.24.0.11 - - [28/Apr/2023:17:49:40 +0000] "GET /_matrix/client/v3/sync?timeout=30000&since=s15404_424384_8_10141_15562_3_183_59_0_1&filter=1&set_presence=offline HTTP/1.1" 200 205 "-" "mautrix-signal/0.4.2 mautrix-python/0.18.8 aiohttp/3.8.3 Python/3.10.8" "-"
Apr 28 17:49:40 myhost matrix-synapse-reverse-proxy-companion[37725]: 172.24.0.7 - - [28/Apr/2023:17:49:40 +0000] "GET /_matrix/client/v3/sync?timeout=30000&since=s15404_424384_8_10141_15562_3_183_59_0_1&filter=1&set_presence=offline HTTP/1.0" 200 256 "-" "mautrix-signal/0.4.2 mautrix-python/0.18.8 aiohttp/3.8.3 Python/3.10.8" "-"
Apr 28 17:49:40 myhost matrix-nginx-proxy[38635]: 172.24.0.11 - - [28/Apr/2023:17:49:40 +0000] "GET /_matrix/client/v3/sync?timeout=30000&since=s15404_424384_8_10141_15562_3_183_59_0_1&filter=1&set_presence=offline HTTP/1.1" 200 205 "-" "mautrix-signal/0.4.2 mautrix-python/0.18.8 aiohttp/3.8.3 Python/3.10.8" "-"
Apr 28 17:49:40 myhost matrix-synapse-reverse-proxy-companion[37725]: 172.24.0.7 - - [28/Apr/2023:17:49:40 +0000] "GET /_matrix/client/v3/sync?timeout=30000&since=s15404_424384_8_10141_15562_3_183_59_0_1&filter=1&set_presence=offline HTTP/1.0" 200 256 "-" "mautrix-signal/0.4.2 mautrix-python/0.18.8 aiohttp/3.8.3 Python/3.10.8" "-"
Apr 28 17:49:40 myhost matrix-nginx-proxy[38635]: 172.24.0.11 - - [28/Apr/2023:17:49:40 +0000] "GET /_matrix/client/v3/sync?timeout=30000&since=s15404_424384_8_10141_15562_3_183_59_0_1&filter=1&set_presence=offline HTTP/1.1" 200 205 "-" "mautrix-signal/0.4.2 mautrix-python/0.18.8 aiohttp/3.8.3 Python/3.10.8" "-"
Apr 28 17:49:40 myhost matrix-synapse-reverse-proxy-companion[37725]: 172.24.0.7 - - [28/Apr/2023:17:49:40 +0000] "GET /_matrix/client/v3/sync?timeout=30000&since=s15404_424384_8_10141_15562_3_183_59_0_1&filter=1&set_presence=offline HTTP/1.0" 200 256 "-" "mautrix-signal/0.4.2 mautrix-python/0.18.8 aiohttp/3.8.3 Python/3.10.8" "-"
Apr 28 17:49:40 myhost matrix-nginx-proxy[38635]: 172.24.0.11 - - [28/Apr/2023:17:49:40 +0000] "GET /_matrix/client/v3/sync?timeout=30000&since=s15404_424384_8_10141_15562_3_183_59_0_1&filter=1&set_presence=offline HTTP/1.1" 200 205 "-" "mautrix-signal/0.4.2 mautrix-python/0.18.8 aiohttp/3.8.3 Python/3.10.8" "-"
Apr 28 17:49:40 myhost matrix-synapse-reverse-proxy-companion[37725]: 172.24.0.7 - - [28/Apr/2023:17:49:40 +0000] "GET /_matrix/client/v3/sync?timeout=30000&since=s15404_424384_8_10141_15562_3_183_59_0_1&filter=1&set_presence=offline HTTP/1.0" 200 256 "-" "mautrix-signal/0.4.2 mautrix-python/0.18.8 aiohttp/3.8.3 Python/3.10.8" "-"
Apr 28 17:49:40 myhost matrix-nginx-proxy[38635]: 172.24.0.11 - - [28/Apr/2023:17:49:40 +0000] "GET /_matrix/client/v3/sync?timeout=30000&since=s15404_424384_8_10141_15562_3_183_59_0_1&filter=1&set_presence=offline HTTP/1.1" 200 205 "-" "mautrix-signal/0.4.2 mautrix-python/0.18.8 aiohttp/3.8.3 Python/3.10.8" "-"
Apr 28 17:49:40 myhost matrix-synapse-reverse-proxy-companion[37725]: 172.24.0.7 - - [28/Apr/2023:17:49:40 +0000] "GET /_matrix/client/v3/sync?timeout=30000&since=s15404_424384_8_10141_15562_3_183_59_0_1&filter=1&set_presence=offline HTTP/1.0" 200 256 "-" "mautrix-signal/0.4.2 mautrix-python/0.18.8 aiohttp/3.8.3 Python/3.10.8" "-"
Apr 28 17:49:40 myhost matrix-nginx-proxy[38635]: 172.24.0.11 - - [28/Apr/2023:17:49:40 +0000] "GET /_matrix/client/v3/sync?timeout=30000&since=s15404_424384_8_10141_15562_3_183_59_0_1&filter=1&set_presence=offline HTTP/1.1" 200 205 "-" "mautrix-signal/0.4.2 mautrix-python/0.18.8 aiohttp/3.8.3 Python/3.10.8" "-"
Apr 28 17:49:40 myhost matrix-synapse-reverse-proxy-companion[37725]: 172.24.0.7 - - [28/Apr/2023:17:49:40 +0000] "GET /_matrix/client/v3/sync?timeout=30000&since=s15404_424384_8_10141_15562_3_183_59_0_1&filter=1&set_presence=offline HTTP/1.0" 200 256 "-" "mautrix-signal/0.4.2 mautrix-python/0.18.8 aiohttp/3.8.3 Python/3.10.8" "-"

sophonet avatar Apr 28 '23 17:04 sophonet

I'm having the same issue

Are you using mautrix-signal too?

rltas avatar May 11 '23 15:05 rltas

Please let's not confuse probably unrelated issues.

"circular" sync calls

You mean a loop? Requests are passed from proxy to proxy companion, could also be massive spam from mautrix-signal, although all requests looking exactly the same surely looks fishy. Does it disappear if you stop just mautrix-signal?

rltas avatar May 11 '23 18:05 rltas

Please let's not confuse probably unrelated issues.

"circular" sync calls

You mean a loop? Requests are passed from proxy to proxy companion, could also be massive spam from mautrix-signal, although all requests looking exactly the same surely looks fishy. Does it disappear if you stop just mautrix-signal?

You're right... I looked more closely at OP's logs. I too have the GET requests to /sync failing, but with different specifics. Also, as I noted above (now deleted), my issues are not related to mautrix-signal. Sorry for the confusion.

jpfulton248 avatar May 11 '23 19:05 jpfulton248

Having the same issue. The server gets stuck every few hours due to extreme logfile flooding; it looks the same as the OP's.

watzfatz avatar Jun 10 '23 12:06 watzfatz

Please let's not confuse probably unrelated issues.

"circular" sync calls

You mean a loop? Requests are passed from proxy to proxy companion, could also be massive spam from mautrix-signal, although all requests looking exactly the same surely looks fishy. Does it disappear if you stop just mautrix-signal?

Indeed, the heavy log files seem to be related to using mautrix-signal in my setup. If I stop the two Signal containers (matrix-mautrix-signal and matrix-mautrix-signal-daemon), the error messages stop appearing. They appear again when the services are automatically relaunched. I'm not 100% sure, though, whether the calls are indeed recursive...

As a temporary workaround, I have disabled the log messages from these two services in /etc/rsyslog.d/matrix.conf:

:programname, isequal, "matrix-synapse-reverse-proxy-companion" stop
:programname, isequal, "matrix-nginx-proxy"     stop

Nevertheless, the problem is not solved, and it causes client applications (e.g. Element on macOS) to hang after running for some time (probably due to countless queued backend calls), so that they need to be killed and restarted in order to work again.

It would be great to know how to continue debugging / gather more information in order to help solve this.

sophonet avatar Jun 11 '23 14:06 sophonet

It might not be relevant, but what you describe sounds similar to what happens if Postgres runs out of connections.

You could try to increase the max number of connections as described here: https://github.com/spantaleev/matrix-docker-ansible-deploy/blob/14f7eed9324b58f4acb264f0cab3b15bfd10ac07/docs/configuring-playbook-synapse.md#load-balancing-with-workers

QEDeD avatar Jul 25 '23 08:07 QEDeD
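For reference, the linked doc raises the Postgres connection limit through a vars.yml override along the following lines. This is only a sketch: the exact variable name (assumed here to be devture_postgres_process_extra_arguments) and the quoting may differ between playbook versions, so double-check the linked doc before using it.

devture_postgres_process_extra_arguments: [
  "-c 'max_connections=200'"
]

After changing it, re-running the playbook (e.g. with the setup-all and start tags) should restart the Postgres container with the new setting.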

I am having this issue as well. I upgraded to the latest version of the playbook today and upgraded Postgres. I also increased the maximum number of connections as per the link from QEDeD. I disabled all bridges (Slack, WhatsApp, Discord), but the issue remains. The logs indicate that it is Element itself that is causing the issue.

I've attached a snippet that pretty much repeats until the log is full.

Here you can see different clients accessing via different browsers, as well as via the app. Everything works normally until the logs fill up and there's no more space on the server. Jonathan

geoidwalker avatar Aug 08 '23 17:08 geoidwalker

logs.txt

geoidwalker avatar Aug 08 '23 17:08 geoidwalker

@geoidwalker that looks like presence spam caused by https://github.com/matrix-org/synapse/issues/16039 and https://github.com/matrix-org/synapse/issues/16057. They reverted the change, but old clients will keep causing the problem.

rltas avatar Aug 09 '23 07:08 rltas
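As a possible stopgap while old clients are still around, presence can be switched off at the homeserver level. A sketch of a vars.yml override, assuming the playbook exposes matrix_synapse_presence_enabled (check the Synapse role defaults of your playbook version):

matrix_synapse_presence_enabled: false

This disables online/offline indicators for all users, so it is a trade-off rather than a fix.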

@geoidwalker that looks like presence spam caused by matrix-org/synapse#16039 and matrix-org/synapse#16057. They reverted the change, but old clients will keep causing the problem.

Indeed, this might have been the case; it is strange that the two issues describing the presence spam were created months after my observation. I have just enabled syslog logging again, and the log messages seem fine now. If this persists for the next week or so, I am going to close this issue. Thanks for the info!

sophonet avatar Oct 31 '23 10:10 sophonet

I'm using this playbook reverse-proxied behind Apache. I get about 10 million lines from Matrix in syslog daily. Most of them look like an internal access log. About 60% of them look like this:

Dec 10 00:01:50 hostname matrix-synapse-reverse-proxy-companion[351640]: 172.22.0.19 - - [09/Dec/2023:23:01:50 +0000] "PUT /_matrix/federation/v1/send/1702037848532 HTTP/1.0" 200 11 "-" "Synapse/1.97.0 (b=matrix-org-hotfixes,f0b69250b5)" "172.26.0.1, 172.27.0.5"
Dec 10 00:01:50 hostname matrix-nginx-proxy[344948]: 172.22.0.19 - - [09/Dec/2023:23:01:50 +0000] "PUT /_matrix/federation/v1/send/1702037848532 HTTP/1.0" 200 11 "-" "Synapse/1.97.0 (b=matrix-org-hotfixes,f0b69250b5)" "172.26.0.1, 172.27.0.5"
Dec 10 00:01:50 hostname matrix-nginx-proxy[344948]: 172.27.0.5 - - [09/Dec/2023:23:01:50 +0000] "PUT /_matrix/federation/v1/send/1702037848532 HTTP/1.1" 200 21 "-" "Synapse/1.97.0 (b=matrix-org-hotfixes,f0b69250b5)" "172.26.0.1"
Dec 10 00:01:50 hostname matrix-traefik[346779]: 172.26.0.1 - - [09/Dec/2023:23:01:50 +0000] "PUT /_matrix/federation/v1/send/1702037848532 HTTP/1.1" 200 11 "-" "-" 3747256 "matrix-nginx-proxy-matrix-federation@docker" "http://172.27.0.4:8448" 9ms
Dec 10 00:01:50 hostname matrix-synapse-reverse-proxy-companion[351640]: 172.22.0.19 - - [09/Dec/2023:23:01:50 +0000] "PUT /_matrix/federation/v1/send/1702037848553 HTTP/1.0" 200 11 "-" "Synapse/1.97.0 (b=matrix-org-hotfixes,f0b69250b5)" "172.26.0.1, 172.27.0.5"
Dec 10 00:01:50 hostname matrix-nginx-proxy[344948]: 172.22.0.19 - - [09/Dec/2023:23:01:50 +0000] "PUT /_matrix/federation/v1/send/1702037848553 HTTP/1.0" 200 11 "-" "Synapse/1.97.0 (b=matrix-org-hotfixes,f0b69250b5)" "172.26.0.1, 172.27.0.5"
Dec 10 00:01:50 hostname matrix-nginx-proxy[344948]: 172.27.0.5 - - [09/Dec/2023:23:01:50 +0000] "PUT /_matrix/federation/v1/send/1702037848553 HTTP/1.1" 200 21 "-" "Synapse/1.97.0 (b=matrix-org-hotfixes,f0b69250b5)" "172.26.0.1"
Dec 10 00:01:50 hostname matrix-traefik[346779]: 172.26.0.1 - - [09/Dec/2023:23:01:50 +0000] "PUT /_matrix/federation/v1/send/1702037848553 HTTP/1.1" 200 11 "-" "-" 3747257 "matrix-nginx-proxy-matrix-federation@docker" "http://172.27.0.4:8448" 10ms

About 30% of them look like this:

Dec 10 00:04:37 hostname matrix-container-socket-proxy[346463]: 172.26.0.3:56426 [09/Dec/2023:23:04:37.704] dockerfrontend dockerbackend/dockersocket 0/0/0/15/15 200 163734 - - ---- 2/2/1/1/0 0/0 "GET /v1.24/containers/json?limit=0 HTTP/1.1"
Dec 10 00:04:37 hostname matrix-container-socket-proxy[346463]: 172.26.0.3:56426 [09/Dec/2023:23:04:37.723] dockerfrontend dockerbackend/dockersocket 0/0/0/0/0 200 5567 - - ---- 2/2/1/1/0 0/0 "GET /v1.24/containers/fc321fb9f7ee8d5cc17b5fc9af523b39bbf85c41407e6b87fbd46712b3549cae/json HTTP/1.1"
Dec 10 00:04:37 hostname matrix-container-socket-proxy[346463]: 172.26.0.3:56426 [09/Dec/2023:23:04:37.724] dockerfrontend dockerbackend/dockersocket 0/0/0/1/1 200 12085 - - ---- 2/2/1/1/0 0/0 "GET /v1.24/containers/2ee6b5bb7b5a652cda66bc28d010c20974ff2073155b6cf9c9cecf45ddb6f50f/json HTTP/1.1"
Dec 10 00:04:37 hostname matrix-container-socket-proxy[346463]: 172.26.0.3:56426 [09/Dec/2023:23:04:37.725] dockerfrontend dockerbackend/dockersocket 0/0/0/0/0 200 7931 - - ---- 2/2/1/1/0 0/0 "GET /v1.24/containers/5fd24a1e2393076aeffb5e5aff769b38ec4484d2f8a5d570f9424f6e70e1eb9a/json HTTP/1.1"
Dec 10 00:04:37 hostname matrix-container-socket-proxy[346463]: 172.26.0.3:56426 [09/Dec/2023:23:04:37.726] dockerfrontend dockerbackend/dockersocket 0/0/0/0/0 200 7612 - - ---- 2/2/1/1/0 0/0 "GET /v1.24/containers/2affe32ad208f64eab8d54082e35fe6671b3d6e58547ca441188406b1647b122/json HTTP/1.1"
Dec 10 00:04:37 hostname matrix-container-socket-proxy[346463]: 172.26.0.3:56426 [09/Dec/2023:23:04:37.727] dockerfrontend dockerbackend/dockersocket 0/0/0/0/0 200 6660 - - ---- 2/2/1/1/0 0/0 "GET /v1.24/containers/7d1f1afaa975f07eb43d8a96ebafebf801a5b9ca53f622d297ec3fc624a9424e/json HTTP/1.1"
Dec 10 00:04:37 hostname matrix-container-socket-proxy[346463]: 172.26.0.3:56426 [09/Dec/2023:23:04:37.728] dockerfrontend dockerbackend/dockersocket 0/0/0/0/0 200 7319 - - ---- 2/2/1/1/0 0/0 "GET /v1.24/containers/5f211bf112fbfeb5945a5280d98311d3f73d122fece0fc01d70daaac78e79ef0/json HTTP/1.1"
Dec 10 00:04:37 hostname matrix-container-socket-proxy[346463]: 172.26.0.3:56426 [09/Dec/2023:23:04:37.729] dockerfrontend dockerbackend/dockersocket 0/0/0/0/0 200 8528 - - ---- 2/2/1/1/0 0/0 "GET /v1.24/containers/6f57c6b3a97b90a3d3461827a3b05c0c562e46d3256e5ff4551b2c0fab145255/json HTTP/1.1"
Dec 10 00:04:37 hostname matrix-container-socket-proxy[346463]: 172.26.0.3:56426 [09/Dec/2023:23:04:37.730] dockerfrontend dockerbackend/dockersocket 0/0/0/0/0 200 7803 - - ---- 2/2/1/1/0 0/0 "GET /v1.24/containers/c95aec3adaa36e86ba4eeaf79e6f12e2644a826ba39c0283fd57f842da52f12d/json HTTP/1.1"

ristein avatar Dec 11 '23 18:12 ristein

There has been no more activity. In my case, the presence spam was the reason; updating to newer clients solved this issue.

sophonet avatar Mar 03 '24 09:03 sophonet

An old client can still cause this issue? So... they didn't actually fix it, did they?

emiliogon avatar May 31 '24 22:05 emiliogon

See above: it has been fixed, and this issue is closed.

sophonet avatar Jun 01 '24 06:06 sophonet