Deployment distribution test fails because the pods are not re-connected

Open deepthidevaki opened this issue 11 months ago • 1 comments

The logs show that zeebe-1 and zeebe-2 had been disconnected earlier in the test. When the test tried to reconnect them, an error occurred on zeebe-1, and we don't see a retry of the same job. As a result, zeebe-1 and zeebe-2 remain disconnected, and the test fails because the deployment is not distributed.

DEBUG 2024-02-28T17:25:55.725725840Z [labels.title: Zeebe deployment distribution] Execute ["sh" "-c" "command -v ip"] on pod zeebe-1
DEBUG 2024-02-28T17:25:55.817082041Z [labels.title: Zeebe deployment distribution] Execute ["sh" "-c" "ip route | grep -m 1 unreachable"] on pod zeebe-1
DEBUG 2024-02-28T17:25:55.909347244Z [labels.title: Zeebe deployment distribution] Execute ["sh" "-c" "ip route del "] on pod zeebe-1
DEBUG 2024-02-28T17:25:55.970531110Z [labels.title: Zeebe deployment distribution] Error on connection Broker: zeebe-1. Error: command terminated with exit code 255
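
For illustration, the missing retry could look roughly like the sketch below. This is not the zeebe-chaos implementation: `retry` and `flaky_reconnect` are hypothetical names, and the reconnect step is simulated by a function that fails once with the error message from the log above and then succeeds.

```python
import time
from typing import Callable


def retry(action: Callable[[], None], attempts: int = 3, delay_s: float = 0.0) -> int:
    """Run `action` until it succeeds and return the number of attempts used.

    Re-raises the last RuntimeError if every attempt fails.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(1, attempts + 1):
        try:
            action()
            return attempt
        except RuntimeError:
            if attempt == attempts:
                raise  # out of attempts: surface the failure to the caller
            time.sleep(delay_s)  # wait before trying the same job again
    raise AssertionError("unreachable")


# Hypothetical stand-in for the reconnect job on zeebe-1:
# it fails on the first call and succeeds on the second.
calls = {"count": 0}


def flaky_reconnect() -> None:
    calls["count"] += 1
    if calls["count"] == 1:
        raise RuntimeError("command terminated with exit code 255")


attempts_used = retry(flaky_reconnect, attempts=3)
```

With a bounded retry like this, a single transient failure would not leave the brokers disconnected, while a persistent failure would still be reported once the attempts run out.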

deepthidevaki, Feb 29 '24 09:02

ZDP-Triage:

  • happened on a QA run
  • so far it doesn't seem to happen often; we may revisit

megglos, Feb 29 '24 13:02

I'm closing this; please reopen if it happens again.

npepinpe, Sep 03 '24 14:09