fission-workflows
Nats connection closed
After a certain amount of time, the workflow pod loses its connection to NATS. Restarting the workflow pod fixes it.
time="2018-09-05T04:23:56Z" level=info msg="Event added: WORKFLOW_CREATED" aggregate=workflow/29dc36fc-b037-11e8-8f31-42010aa0001c nats.subject=workflow.29dc36fc-b037-11e8-8f31-42010aa0001c parent=/
time="2018-09-05T04:23:56Z" level=error msg="Request error: nats: connection closed"
time="2018-09-05T04:23:56Z" level=error msg="failed to specialize: failed to specialize package: failed to store workflow internally: nats: connection closed"
This leads to a 502 whenever any workflow function is called. We should have some way to check and restore the connection to NATS so that users do not run into these 502 errors.
Thanks @vishal-biyani for the issue! Most of the reconnect logic is handled by the NATS client, but apparently not in all cases. To fix this, we should indeed add functionality to reconnect to NATS when the connection is lost.
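To make the idea concrete, here is a minimal sketch of what "reconnect on connection loss" could look like. It is not the fission-workflows code and does not use the real NATS client API: `Conn`, `PublishFunc`, `Dialer`, `ReconnectingConn`, and `ErrClosed` are all hypothetical names made up for illustration, with `ErrClosed` standing in for the `nats: connection closed` error in the logs above. The wrapper re-dials once when a publish fails because the connection is closed, then retries the publish.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

// ErrClosed is a hypothetical stand-in for the "nats: connection closed"
// error seen in the logs; it is not the real client's error value.
var ErrClosed = errors.New("connection closed")

// Conn is a hypothetical, minimal view of a message-bus connection.
type Conn interface {
	Publish(subject string, data []byte) error
}

// PublishFunc adapts a plain function to the Conn interface,
// which keeps fakes for testing short.
type PublishFunc func(subject string, data []byte) error

// Publish calls the underlying function.
func (f PublishFunc) Publish(subject string, data []byte) error {
	return f(subject, data)
}

// Dialer opens a fresh connection (hypothetical).
type Dialer func() (Conn, error)

// ReconnectingConn wraps a Conn and re-dials when the connection is closed.
type ReconnectingConn struct {
	mu   sync.Mutex
	dial Dialer
	conn Conn
}

// NewReconnectingConn dials the initial connection.
func NewReconnectingConn(dial Dialer) (*ReconnectingConn, error) {
	c, err := dial()
	if err != nil {
		return nil, err
	}
	return &ReconnectingConn{dial: dial, conn: c}, nil
}

// Publish forwards to the current connection. If that fails with ErrClosed,
// it opens a new connection and retries exactly once.
func (r *ReconnectingConn) Publish(subject string, data []byte) error {
	r.mu.Lock()
	defer r.mu.Unlock()

	err := r.conn.Publish(subject, data)
	if !errors.Is(err, ErrClosed) {
		return err // success (nil) or an unrelated error
	}

	c, dialErr := r.dial()
	if dialErr != nil {
		return fmt.Errorf("reconnect failed: %w", dialErr)
	}
	r.conn = c
	return r.conn.Publish(subject, data)
}
```

A real fix would probably also want a bounded retry loop with backoff and a health check, so that a dead connection is noticed before a user request hits it. The single retry here only shows the shape of the wrapper.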