GeoHealthCheck

Retry on HTTP 502 error during OWSLib request

Open jochemthart1 opened this issue 4 years ago • 1 comments

This issue is related to issue #366.

Is your feature request related to a problem? Please describe. During a stress test of GHC (many resources probed at high frequency), I am getting unexplained HTTP 502 errors during the WebFeatureService request in the get_metadata function of wfs.py, wms.py, etc. These errors can be avoided by retrying the request.

Describe the solution you'd like Currently, probe.py already handles this case through the create_requests_retry_session() function in util.py. The same principle could be applied to the request made in the get_metadata function of each probe (wfs.py, wms.py, etc.).

This request, however, goes through the WebFeatureService() function from OWSLib, so the best solution in my opinion would be to use a retry session like create_requests_retry_session() inside OWSLib itself. Maybe other OWSLib users have also seen this HTTP 502 error? Such a fix might improve OWSLib for other users as well.
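For reference, the retry-session idea can be sketched with plain requests and urllib3. This is a minimal sketch and not the actual code in GHC's util.py: the name make_retry_session and its default values are assumptions chosen for illustration, and it does not show whether OWSLib can be handed such a session.

```python
import requests
from requests.adapters import HTTPAdapter
from urllib3.util.retry import Retry


def make_retry_session(retries=3, backoff_factor=0.5,
                       status_forcelist=(502, 503, 504)):
    """Hypothetical helper: build a requests.Session that retries requests."""
    # Retry connection/read errors and the listed HTTP status codes,
    # waiting longer between successive attempts (backoff_factor).
    retry = Retry(
        total=retries,
        connect=retries,
        read=retries,
        backoff_factor=backoff_factor,
        status_forcelist=status_forcelist,
    )
    adapter = HTTPAdapter(max_retries=retry)
    session = requests.Session()
    # Apply the retry policy to both plain and TLS connections.
    session.mount('http://', adapter)
    session.mount('https://', adapter)
    return session
```

A caller would then use session.get(url) in place of requests.get(url); the retries happen inside the adapter, so the calling code does not need its own loop.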

Describe alternatives you've considered An alternative is a simple loop with a try/except clause that retries the WebFeatureService() request up to 3 times:

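    def get_metadata(self, resource, version='1.1.0', retries=3):
        """
        Get metadata, specific per Resource type.
        :param resource: Resource to fetch metadata for
        :param version: service version to request
        :param retries: maximum number of attempts
        :return: Metadata object
        """
        request_headers = self.get_request_headers()
        last_error = None
        for _ in range(retries):
            try:
                return WebFeatureService(
                    resource.url,
                    version=version,
                    headers=request_headers)
            except Exception as err:
                # Remember the failure and try again
                last_error = err

        # All attempts failed: chain the last error for easier debugging
        raise Exception(
            f'Get Metadata Request Exceeded {retries} Retries') from last_error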
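A variant of the same loop that waits between attempts and lets the caller decide which errors are worth retrying could look roughly like this. It is only a sketch: call_with_retries, should_retry and the delay values are hypothetical names and defaults, not part of GHC or OWSLib.

```python
import time


def call_with_retries(func, retries=3, delay=0.5, backoff=2.0,
                      should_retry=lambda err: True, sleep=time.sleep):
    """Call func() up to `retries` times, sleeping between failed attempts."""
    last_error = None
    for attempt in range(retries):
        try:
            return func()
        except Exception as err:
            # Errors the caller does not consider transient are re-raised at once.
            if not should_retry(err):
                raise
            last_error = err
            # Exponential backoff; no sleep after the final attempt.
            if attempt < retries - 1:
                sleep(delay * backoff ** attempt)
    raise RuntimeError(f'Exceeded {retries} retries') from last_error
```

Inside get_metadata this would wrap the WebFeatureService() call in a lambda, for example call_with_retries(lambda: WebFeatureService(resource.url, version=version, headers=request_headers)).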

jochemthart1, Jul 30 '21 08:07

Yes, best to fix with a "retry session" in OWSLib. However, there can always be cases where there really is a problem: for example, a load balancer in front of 3 backend GeoServer instances, 2 of which are permanently failing for some reason. Then 3 retries will always succeed, but the problem remains undetected. Maybe a "Warning" or "Suspicious" type of verdict would be better in those cases. Or are these 502 cases somehow caused within the GHC Docker environment?
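To make the "Warning" idea concrete, here is a rough sketch in which the retry loop records every failed attempt and reports a softer verdict when the request only succeeded after retrying. The function name probe_with_verdict and the verdict strings 'OK', 'WARNING' and 'FAILED' are assumptions for illustration, not existing GHC result types.

```python
def probe_with_verdict(request, retries=3):
    """Return (verdict, result, errors) for a request tried up to `retries` times."""
    errors = []
    for _ in range(retries):
        try:
            result = request()
        except Exception as err:
            # Keep every failure so it can be reported, not silently dropped.
            errors.append(err)
            continue
        # Success after at least one failure is suspicious, not a clean pass.
        verdict = 'WARNING' if errors else 'OK'
        return verdict, result, errors
    return 'FAILED', None, errors
```

With the load-balancer example above (2 of 3 backends failing, requests distributed round-robin), the request would succeed within 3 attempts but yield 'WARNING' together with the recorded errors whenever a failing backend was hit first.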

justb4, Aug 30 '21 13:08