docker-transmission-openvpn
Container exits with code 0 but it should be a non-zero code
Is there a pinned issue for this?
- [X] I have read the pinned issues and could not find my issue
Is there an existing or similar issue/discussion for this?
- [X] I have searched the existing issues
- [X] I have searched the existing discussions
Is there any comment in the documentation for this?
- [X] I have read the documentation, especially the FAQ and Troubleshooting parts
Is this related to a provider?
- [X] I have checked the provider repo for issues
- [X] My issue is NOT related to a provider
Are you using the latest release?
- [X] I am using the latest release
Have you tried using the dev branch latest?
- [X] I have tried using dev branch
Docker run config used
version: '3.7'
services:
  transmission-openvpn:
    container_name: transmission-openvpn
    cap_add:
      - NET_ADMIN
    .....
    .....
    .....
  autoheal: # https://github.com/willfarrell/docker-autoheal
    image: willfarrell/autoheal
    environment: # https://github.com/willfarrell/docker-autoheal#env-defaults
      - AUTOHEAL_CONTAINER_LABEL=autoheal
      - AUTOHEAL_INTERVAL=5 # check every 5 seconds
      - AUTOHEAL_START_PERIOD=30 # wait 30 seconds before first health check
      - AUTOHEAL_DEFAULT_STOP_TIMEOUT=10 # Docker waits max 10 seconds (the Docker default) for a container to stop before killing during restarts (container overridable via label, see below)
      - DOCKER_SOCK=/var/run/docker.sock # Unix socket for curl requests to Docker API
      - CURL_TIMEOUT=10 # --max-time seconds for curl requests to Docker API
      - WEBHOOK_URL="" # post message to the webhook if a container was restarted (or restart failed)
    volumes:
      - '/var/run/docker.sock:/var/run/docker.sock'
Current Behavior
transmission-openvpn | [Secure-Server] Inactivity timeout (--ping-restart), restarting
transmission-openvpn | /etc/openvpn/tunnelDown.sh tun0 ***************************************** init
transmission-openvpn | resolv.conf was restored
transmission-openvpn | Sending kill signal to transmission-daemon
transmission-openvpn | Waiting 5s for transmission-daemon to die
transmission-openvpn | Successfuly closed transmission-daemon
.....
.....
.....
transmission-openvpn | SIGTERM[soft,ping-restart] received, process exiting
transmission-openvpn exited with code 0
The container exits with code 0.
Expected Behavior
The container should exit with a non-zero code to indicate failure (i.e. unhealthiness), so that the autoheal functionality kicks in.
How have you tried to solve the problem?
This is an essential issue with the Docker image, so upstream needs to fix it.
Log output
transmission-openvpn | [Secure-Server] Inactivity timeout (--ping-restart), restarting
transmission-openvpn | /etc/openvpn/tunnelDown.sh tun0 ***************************************** init
transmission-openvpn | resolv.conf was restored
transmission-openvpn | Sending kill signal to transmission-daemon
transmission-openvpn | Waiting 5s for transmission-daemon to die
transmission-openvpn | Successfuly closed transmission-daemon
.....
.....
.....
transmission-openvpn | SIGTERM[soft,ping-restart] received, process exiting
transmission-openvpn exited with code 0
HW/SW Environment
- OS: Debian GNU/Linux 11 (bullseye)
- Docker: 20.10.17
Anything else?
I could not find the script in this repository that catches and handles these errors. I can fix it if you show me where these OpenVPN-related errors are handled.
This is why I decided to use Autoheal.
- https://github.com/haugene/docker-transmission-openvpn/blob/25b9724178f48227084f5a462b82b1fbc087498d/docs/faq.md#set-the-ping-exit-option-for-openvpn-and-restart-flag-in-docker
- https://github.com/haugene/docker-transmission-openvpn/blob/25b9724178f48227084f5a462b82b1fbc087498d/docs/faq.md#use-a-third-party-tool-to-monitor-and-restart-the-container
this is where the sigterm is trapped https://github.com/haugene/docker-transmission-openvpn/blob/master/scripts/healthcheck.sh
That's not the right script. Not related to the problem, at all.
I finally found the problem.
https://github.com/haugene/docker-transmission-openvpn/blob/e6fd367db74075e2b507d420191a55a43b5e8d90/openvpn/modify-openvpn-config.sh#L47
ping-exit lets OpenVPN exit with a zero exit code on failure. This makes the container stop "properly", so it does not count as a failure.
I do not see an easy way to prevent this, except making significant changes or adding another tool just for checking this, which would be overkill.
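For illustration only, the general idea of turning a "clean" exit into a failure can be sketched in a few lines of shell. This is a minimal sketch under stated assumptions: `fail_on_clean_exit` is a hypothetical helper that does not exist in this repository, the commands passed to it are stand-ins, and whether something like this could be wired into the image's startup scripts has not been checked here.

```shell
#!/bin/sh
# Hypothetical helper, not part of the image: run a long-lived command and
# treat a "clean" exit (status 0) as a failure, on the assumption that the
# service is never expected to stop on its own.
fail_on_clean_exit() {
  "$@"
  status=$?
  if [ "$status" -eq 0 ]; then
    # 1 is an arbitrary choice for "stopped unexpectedly".
    return 1
  fi
  # Pass any real failure code through unchanged.
  return "$status"
}

# Stand-in commands; the helper never returns 0, so each echo runs and
# prints the status the helper returned.
fail_on_clean_exit true || echo "clean exit reported as: $?"
fail_on_clean_exit sh -c 'exit 7' || echo "failure code kept as: $?"
```

With a wrapper of this kind, a process that stops with status 0 would be reported to the caller as status 1, while genuine failure codes would be preserved.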
If you read the script you can disable each of the modifications if you want.
If you read the script you can disable each of the modifications
I don't think the modifications are bad. They are pretty good. It's just that ping-exit works the wrong way and there is no reasonable way to change that in OpenVPN.
So, to circumvent this, it would be necessary to invest a lot of effort into solving such a seemingly small problem, which is why I'm not sure how this problem should be approached.
Sorry, I just reread the initial issue, but I'm still not sure exactly why you are trying to do it this way. We have a built-in autoheal script which sets the autoheal flag and reports the container as unhealthy. That's when you can use the third-party container to restart it. So I don't see the issue here.
The issue is that any built-in healthcheck is useless in this situation, because the whole disconnect-and-exit process takes about 3 seconds. The container is long gone with a zero exit code before any healthcheck mechanism has a chance to kick in. When it exits with a zero exit code, the container has stopped "gracefully" according to Docker, which means there is nothing to fix or restart from the view of autoheal, Docker or any other keepalive mechanism.
Therefore, it would be ideal if the container stopped with a non-zero exit code to begin with, which would make the autoheal method work, and Docker's on-failure restart policy would work as well.
As it stands, the only way to make this container restart is to use Docker's restart policy unless-stopped. All other alternatives currently do not work, but are meant to work and should work.
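To make the role of the exit code concrete, Docker's restart policies can be modelled roughly as below. This is a simplified sketch, not Docker's implementation: `would_restart` is a hypothetical helper, and it ignores manual stops, retry limits and daemon restarts.

```shell
#!/bin/sh
# Hypothetical model of Docker's restart policies (no, on-failure, always,
# unless-stopped). Prints "yes" if a container that exited on its own with
# the given exit code would be restarted under the given policy.
would_restart() {
  policy=$1
  exit_code=$2
  case "$policy" in
    always|unless-stopped)
      # These policies do not look at the exit code.
      echo yes
      ;;
    on-failure)
      # Only a non-zero exit code counts as a failure.
      if [ "$exit_code" -ne 0 ]; then echo yes; else echo no; fi
      ;;
    *)
      # "no" (the default) and anything unrecognised: never restart.
      echo no
      ;;
  esac
}

# The case discussed in this issue is an exit code of 0.
for policy in no on-failure always unless-stopped; do
  echo "$policy, exit 0: $(would_restart "$policy" 0)"
done
```

In this model, a zero exit code leads to a restart only under always or unless-stopped, which matches the behaviour described above for on-failure.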
Well, for me it works fine based on the connectivity check: when the VPN server disconnects, the container gets marked as unhealthy and is restarted.
For me it works fine based on connectivity check and when the vpn server disconnects, Container gets marked as unhealthy and is restarted.
It's not the case though, because then autoheal would've worked to begin with in my case and I wouldn't have created this issue in the first place.
Maybe your restart policy is set to unless-stopped.
There is no way the healthcheck can fix this, when the container is dead with a zero exit code, so quickly.