moulin
Don't immediately quit when the server is unavailable during a heartbeat
When the server is briefly unavailable during a heartbeat, the client currently panics and quits. Instead, it should keep retrying for a bounded time (the task timeout duration) and quit only after that.
This should make server restarts less risky.
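A minimal sketch of the retry behaviour described above, not the actual client code. The names `retryHeartbeat` and `errServerUnavailable`, the `send` callback, and the fixed retry interval are all assumptions made for illustration; the real heartbeat call and the task timeout would come from the existing client.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// errServerUnavailable is a hypothetical sentinel error standing in for
// whatever a failed heartbeat request returns in the real client.
var errServerUnavailable = errors.New("server unavailable")

// retryHeartbeat calls send until it succeeds or taskTimeout has elapsed.
// It returns nil on the first success. If another attempt would fall past
// the deadline, it gives up and returns the last error, wrapped.
func retryHeartbeat(send func() error, taskTimeout, interval time.Duration) error {
	deadline := time.Now().Add(taskTimeout)
	for {
		err := send()
		if err == nil {
			return nil
		}
		// Stop once the next attempt would start after the deadline.
		if time.Now().Add(interval).After(deadline) {
			return fmt.Errorf("heartbeat still failing after %v: %w", taskTimeout, err)
		}
		time.Sleep(interval)
	}
}
```

With something like this, the client would quit only when `retryHeartbeat` returns a non-nil error, so a server restart shorter than the task timeout would no longer end the task.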
Additionally, when the heartbeat fails, the Go process currently exits but the subprocess does not. This may ultimately cause the task to be completed twice, unexpectedly. We should keep trying until the task is successfully marked as either failed or succeeded.