Rodrigo Lopes
This issue seems to have happened again here: https://console.cloud.google.com/errors/detail/CJWf2o7n1-2ipwE;service=zeebe;time=P7D?project=camunda-saas-prod
This has happened again [here](https://console.cloud.google.com/logs/query;query=error_groups.id%3D%22CP35roSVzbahbg%22%0AlogName:%22stdout%22%0Aresource.type%3D%22k8s_container%22%0Aresource.labels.location%3D%22us-east1%22%0Aresource.labels.container_name%3D%22zeebe%22%0Aresource.labels.project_id%3D%22camunda-cloud-240911%22%0Aresource.labels.cluster_name%3D%22prod-worker-2%22%0Aresource.labels.pod_name%3D%22zeebe-1%22%0Aresource.labels.namespace_name%3D%2232d4f193-227e-4689-a711-9125dd449331-zeebe%22;cursorTimestamp=2024-04-19T14:30:36.094475413Z;startTime=2024-04-19T14:01:06.094Z;endTime=2024-04-19T15:01:06.094Z?project=camunda-cloud-240911).
We just finished the last step for QA so I will mark this epic as done.
This long error message output has occurred again [here](https://console.cloud.google.com/logs/query%3Bquery=logName:%22stdout%22%0Aresource.type=%22k8s_container%22%0Aresource.labels.container_name=%22operate-importer%22%0Aresource.labels.cluster_name=%22worker-3%22%0Aresource.labels.location=%22europe-north1%22%0Aresource.labels.namespace_name=%22abbaad17-963f-42eb-8167-66ba5b263f87-zeebe%22%0Aresource.labels.project_id=%22camunda-saas-int%22%0A%3BpinnedLogId=2024-04-19T07:25:11.400822894Z/kq3putc0qs2t2hgz%3BcursorTimestamp=2024-04-19T07:25:11.398708769Z%3BstartTime=2024-04-19T06:55:41.400Z%3BendTime=2024-04-19T07:55:41.400Z?project=camunda-saas-int), although the cluster seems to have been deleted afterwards.
Happened again here https://github.com/camunda/zeebe/actions/runs/8925112776/job/24513039306?pr=18184
Further findings: It is relatively easy to reproduce the failure locally by rerunning the test until it fails. (On average it takes fewer than 10 runs on my machine.) Similar...
Sorry, I got the libraries confused while I was writing; I meant the one from AWS. Looking at this issue again, it seems that in the linked...
Triage: @mustafadagher were there any more recent examples of this happening, so that we can assess the likelihood?
I ran the test locally around 100 times, and also ran the QA integration tests in CI 5 times, just to be sure that it is not flaky.
I was not able to reproduce the issue locally. The output on the right is from the run that showed the flakiness. On the second transition, we can see...