
ECS task fails and crashes when RDS queries return error

Open TejasRGitHub opened this issue 1 year ago • 3 comments

Is your idea related to a problem? Please describe. As currently coded, the ECS verifier task fetches all active shares, loops through them, and applies the share verifier to each one. Suppose a failure occurs while querying RDS, as shown in the screenshots below:

image

image

In this case, the share verifier crashes and exits without completing verification for the remaining shares.

Describe the solution you'd like Add robustness by adding try/except blocks at the top level to handle exceptions arising from RDS queries. Identify any exceptions on which the share verifier should crash and stop; all other exceptions should be logged so that the share verifier service keeps running.


TejasRGitHub avatar May 23 '24 14:05 TejasRGitHub

Hi @TejasRGitHub ! I believe this issue is the same as #1266 . Thanks for the update! Currently it's in our ToDo-list.

SofiaSazonova avatar May 23 '24 14:05 SofiaSazonova

Hi @SofiaSazonova , thanks for pointing to this issue. Although that issue is specifically about the share manager, I will add a comment there and reference this issue. Please let me know, or feel free to close this issue in favour of https://github.com/data-dot-all/dataall/issues/1266

TejasRGitHub avatar May 23 '24 14:05 TejasRGitHub

I think this could be a quick implementation: add robustness to the share verifier ECS task by wrapping each item processed in the loop in a try/except block.
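A minimal sketch of that idea, not the actual data.all code: `verify_all_shares`, `verify_share`, and `FatalVerifierError` are hypothetical names used only for illustration. It assumes each share can be verified independently, so one failing share is logged and skipped while the loop continues, and only errors explicitly marked as fatal stop the task.

```python
import logging

logger = logging.getLogger(__name__)


class FatalVerifierError(Exception):
    """Hypothetical marker for errors that should stop the whole task."""


def verify_all_shares(shares, verify_share):
    """Run verify_share on every share, logging failures instead of crashing.

    Returns a list of (share, exception) pairs for the shares that failed.
    """
    failures = []
    for share in shares:
        try:
            verify_share(share)
        except FatalVerifierError:
            # Errors deliberately marked as fatal still abort the run.
            raise
        except Exception as exc:
            # Any other error (e.g. a failed RDS query) is logged with its
            # traceback, recorded, and the loop moves on to the next share.
            logger.exception("Share verification failed for %r", share)
            failures.append((share, exc))
    return failures
```

Returning the collected failures lets the caller decide how to report them once every share has been attempted, for example by exiting with a non-zero status so the ECS task is still marked as failed.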

I am going to nominate this issue as a candidate for v2.7, separate from #1266, which I think details additional proposed changes.

noah-paige avatar Jul 12 '24 18:07 noah-paige