Admin interface becomes intermittently unavailable, restart fixes the problem
Environment & Version
- Docker Compose version v2.27.1
- Version: 2024.06
Description
Once every week or two, the admin interface becomes unavailable. Restarting Mailu with the command below fixes the problem:
docker compose down && docker compose up -d --build
The issue is seemingly random but when it happens, I have an opportunity to gather diagnostic information. Please let me know what is needed to progress with the investigation.
Replication Steps
The problem seems to happen randomly and I am unable to reproduce it.
Observed behaviour
The browser reports a socket timeout when connecting to the admin URL.
Logs
I have nagios in place, so I know around what time the admin console becomes inaccessible. I can grab the logs, anonymize them and upload but I would appreciate knowing which ones are going to be useful. Are logs for the mailu-admin container enough?
Logs for admin and front would be useful; if you can get them at a higher log level that would be even better.
in your mailu.env: LOG_LEVEL=debug
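To keep the upload small, one option is to ask Docker Compose only for the log entries around the last successful monitoring check. The sketch below is an illustration, not part of Mailu. The helper names, the 30-minute window and the output file names are assumptions, `admin` and `front` are the service names used in this thread, and it has to run from the directory that holds the Compose file.

```python
import subprocess
from datetime import datetime, timedelta


def build_logs_command(service, around, window_minutes=30):
    """Build a `docker compose logs` call limited to a window around `around`."""
    delta = timedelta(minutes=window_minutes)
    return [
        "docker", "compose", "logs", "--no-color", "--timestamps",
        # A timestamp without an offset is read in the client's local timezone.
        "--since", (around - delta).isoformat(timespec="seconds"),
        "--until", (around + delta).isoformat(timespec="seconds"),
        service,
    ]


def collect_logs(services, around, window_minutes=30):
    """Write one <service>.log file per service (hypothetical file naming)."""
    for service in services:
        cmd = build_logs_command(service, around, window_minutes)
        with open(f"{service}.log", "w") as out:
            subprocess.run(cmd, stdout=out, stderr=subprocess.STDOUT, check=True)
```

For example, `collect_logs(["admin", "front"], datetime(2024, 9, 3, 1, 44, 28))` would write `admin.log` and `front.log` covering 30 minutes on either side of that time.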
Thanks, I will evaluate if I can handle debug level logs and get back to you.
Btw, is there a way to generate a snapshot of the admin interface process? Something similar to javacore but for Python. I could log in to the container and generate it a few times once I realize that I can no longer connect to the admin console.
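For CPython processes, one tool that plays a role similar to javacore is py-spy, whose `dump` subcommand prints the current stack of every thread in a running process without restarting it. The sketch below takes a few dumps in a row. It assumes that py-spy is installed where the script runs, that the caller is allowed to trace the target (inside Docker this typically means the `SYS_PTRACE` capability), and that `pid` belongs to the admin's Python process. The helper names, the count and the interval are hypothetical.

```python
import subprocess
import time


def build_dump_command(pid):
    """Command line for a single py-spy stack dump of process `pid`."""
    return ["py-spy", "dump", "--pid", str(pid)]


def take_dumps(pid, count=3, interval_seconds=10):
    """Collect `count` stack dumps, `interval_seconds` apart, as strings."""
    dumps = []
    for i in range(count):
        result = subprocess.run(
            build_dump_command(pid), capture_output=True, text=True
        )
        # Keep stderr too: py-spy reports permission problems there.
        dumps.append(result.stdout + result.stderr)
        if i < count - 1:
            time.sleep(interval_seconds)
    return dumps
```

Comparing several dumps taken while the console is unreachable shows whether the threads are stuck at the same frame each time.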
I am uploading the anonymized logs. Looks like the problem happened some time after the last successful nagios check:
253.250.255.217 - - [03/Sep/2024:01:44:28 +0200] "GET /sso/login HTTP/1.0" 200 7285 "-" "check_http/v2.4.10 (nagios-plugins 2.4.10)"
This happened to me again last night. It happened twice before, but initially I thought it was something on my end. Running docker compose restart front mitigated the problem.
I see that docker compose ps reports mailu-front as unhealthy. Nothing in the logs seems to indicate this issue. However, it got me wondering. Shouldn't Docker automatically restart the front service if it is not passing health checks? This would be a nice fallback mechanism, but it won't solve the root cause.
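For context on the restart question: with plain Docker Compose, the restart policy reacts to a container exiting. A failing health check only marks the container `unhealthy` and does not restart it. A possible stopgap is a small watchdog run periodically, for example from cron. The sketch below is an illustration under assumptions, not a Mailu feature. `front` is the service name from this thread, the helper names are hypothetical, the service is assumed to run a single container, and the script has to run from the Compose project directory.

```python
import subprocess


def needs_restart(health_status):
    """Return True only for the status Docker reports on failed health checks."""
    return health_status.strip() == "unhealthy"


def container_health(service):
    """Look up the health status of a Compose service's (single) container."""
    container_id = subprocess.run(
        ["docker", "compose", "ps", "-q", service],
        capture_output=True, text=True, check=True,
    ).stdout.strip()
    return subprocess.run(
        ["docker", "inspect", "--format", "{{.State.Health.Status}}", container_id],
        capture_output=True, text=True, check=True,
    ).stdout.strip()


def restart_if_unhealthy(service="front"):
    """Restart the service when unhealthy; return whether a restart happened."""
    if needs_restart(container_health(service)):
        subprocess.run(["docker", "compose", "restart", service], check=True)
        return True
    return False
```

This only masks the symptom. Capturing logs or stack dumps before the restart keeps the evidence needed to find the root cause.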
Note: I'm using Mailu 2024.06.15 right now.
Whenever this problem happens for me, SMTP port 25 also becomes unavailable for some reason. Other services are never affected, not even other postfix ports. Not sure how the SMTP port and the admin console are related but they must be in some obscure way.
This sounds just like https://github.com/Mailu/Mailu/issues/3398. I just hit this bug too.
This issue has seen no activity since it was marked stale. Stale issues are automatically closed after 14 days.