HDDS-10702. Improve Recon startup failure handling and make it more resilient.
What changes were proposed in this pull request?
This PR addresses Recon initialisation issues caused by code inherited from SCM and by other Recon startup errors. It introduces a new context object, ReconContext, that holds Recon health information and startup errors from the various Recon modules. As part of this change, ReconContext is used in the ReconSCM initialisation flow and can be injected into other modules later. The information held in ReconContext can later be used to show a meaningful message to the user on the Recon UI.
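To make the idea concrete, here is a minimal sketch of a context object of this kind. It is not the ReconContext added by this PR: the class name StartupHealthContext, the ErrorCode values, and all method names are hypothetical and chosen only for illustration.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical stand-in for a ReconContext-like holder; not the real class.
final class StartupHealthContext {

  // Illustrative error codes; the real set of codes is an assumption.
  enum ErrorCode { INVALID_NETWORK_TOPOLOGY, OTHER_STARTUP_ERROR }

  // Overall health flag, flipped to false once any error is recorded.
  private final AtomicBoolean healthy = new AtomicBoolean(true);

  // Latest message per error code; safe to update from several modules.
  private final Map<ErrorCode, String> errors = new ConcurrentHashMap<>();

  // Called by a module that hit a startup problem but wants to keep running.
  void recordError(ErrorCode code, String message) {
    errors.put(code, message);
    healthy.set(false);
  }

  boolean isHealthy() {
    return healthy.get();
  }

  // Read-only snapshot that a later API or UI layer could render.
  Map<ErrorCode, String> getErrors() {
    return Collections.unmodifiableMap(new HashMap<>(errors));
  }
}
```

A single shared instance of such a holder could be injected into each module, so that the component that eventually reports health only has to read one object.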
What is the link to the Apache JIRA?
https://issues.apache.org/jira/browse/HDDS-10702
How was this patch tested?
Tested manually and by adding a JUnit test case.
@dombizita @ArafatKhan2198 kindly review.
@devmadhuu Thanks for working on this. I am unable to find any usages of ReconContext: here it is only populated with data, and that data is not used anywhere except in test code. Also, the error is just suppressed and stored in the context; it does not make Recon fail.
The objective of this PR is just to store data in ReconContext. The data will be used in a later PR, where an API will expose the meaningful information to the UI. We don't want Recon startup to fail. Rather, we want to expose which information is not available in Recon and for what reason.
@devmadhuu I have left a few minor comments on this. This implementation just avoids InvalidTopologyException. So the next PR will show this information on the UI, or raise an alert about the issue as a health report, right?
Yes, currently this PR just provides a way for Recon to show meaningful information on the UI for failures. This PR handles InvalidTopologyException, but later ReconContext can hold error or failure information for other types of failures as well, which can also be shown on the Recon UI.
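The approach discussed in this thread (record the failure and keep starting up, instead of aborting) can be sketched roughly as follows. This is an illustration only, not code from the patch: StartupSketch, its nested TopologyException, and the method names are hypothetical stand-ins.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical sketch of "record and continue" startup handling.
final class StartupSketch {

  // Stand-in for the topology exception named in the thread; not the real class.
  static final class TopologyException extends RuntimeException {
    TopologyException(String message) {
      super(message);
    }
  }

  private final List<String> startupErrors = new ArrayList<>();
  private boolean scmModuleReady;

  // Runs one module's initialisation; a topology failure is recorded
  // instead of being rethrown, so the overall startup can continue.
  void initScmModule(Runnable scmInit) {
    try {
      scmInit.run();
      scmModuleReady = true;
    } catch (TopologyException e) {
      startupErrors.add("SCM module unavailable: " + e.getMessage());
      scmModuleReady = false;
    }
  }

  boolean isScmModuleReady() {
    return scmModuleReady;
  }

  // Read-only view that a later health API could expose.
  List<String> getStartupErrors() {
    return Collections.unmodifiableList(startupErrors);
  }
}
```

Only the one exception type is caught here, mirroring the scope described above; other failure types would still propagate until they are handled explicitly.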
Thanks for updating the patch @devmadhuu, the changes look good! Could you please take a look at the failing tests in your fork: https://github.com/devmadhuu/ozone/actions/runs/9015109706/job/24769697868