hato icon indicating copy to clipboard operation
hato copied to clipboard

Consider adding a retry handler

Open vincentjames501 opened this issue 5 years ago • 5 comments

It’s common to want to retry (typically idempotent requests like GET/HEAD/OPTIONS) on certain failures such as IOExceptions. The JDK HttpClient does have some retry logic for ConnectException baked in, but it isn’t configurable. We’ve ran into some issues with the JDK HttpClient in prod with some random IOExceptions and connection reset issues for long lived connections which a simple retry would’ve been sufficient to get past the transient issue. Apache HttpClient (and clj-http by extension) by default will retry certain requests up to 3 times which is great and allows supplying a custom retry handler for doing things like exponential backoffs, dealing with rate limiting, etc which would be awesome to have for a production-ready client. It’s unfortunate the JDK HttpClient doesn’t expose this logic by default.

Thoughts?

vincentjames501 avatar Dec 14 '20 04:12 vincentjames501

Thanks for raising this. Given this is Clojure, it is pretty easy to wrap your requests to allow arbitrary retry logic on status codes / backoffs etc. Maybe the first step would be to document an example of this, and if it proves sufficiently useful/flexible, add it in as an opt-in feature via {:retry :auto}?

gnarroway avatar Dec 16 '20 01:12 gnarroway

Wrapping it in a generic way is a little tricky as :async? and :throw-exceptions change the logic just enough to be annoying. That's actually one of the things I liked so much about http-kit was it was always async and there was no :throw-exceptions type option. For reference here is what clj-http does:

;; Apache's http client automatically retries on IOExceptions, if you
;; would like to handle these retries yourself, you can specify a
;; :retry-handler. Return true to retry, false to stop trying:
(client/post "http://example.org" {:multipart [["title" "Foo"]
                                               ["Content/type" "text/plain"]
                                               ["file" (clojure.java.io/file "/tmp/missing-file")]]
                                   :retry-handler (fn [ex try-count http-context]
                                                    (println "Got:" ex)
                                                    (if (> try-count 4) false true))})

The :retry-handler abstraction is nice and simple. I'd vote for something like this vs only having {:retry :auto} or having :auto just map to a default retry-handler.

We have lots of services and copying retry logic can be tedious. We have a library now wrapping hato (using a common httpclient, timeouts, redirect policies, etc) where we're adding this but would certainly be nicer to have first class support as anyone using an http client in production wants it be resilient to transient failures w/o writing a bunch of custom code in most cases (definitely a selling point of OkHttp and Apache HttpClient).

For reference, OkHttp has some default retry logic, but can be customized via the interceptor pattern:

client.interceptors().add(new Interceptor() {
    @Override
    public Response intercept(Chain chain) throws IOException {
        Request request = chain.request();
        Response response = chain.proceed(request);
        int tryCount = 0;
        int maxLimit = 3;
        while (!response.isSuccessful() && tryCount < maxLimit) {
            tryCount++;
            // retry the request
            response = chain.proceed(request);
        }
        return response;
    }
});

vincentjames501 avatar Dec 16 '20 01:12 vincentjames501

Makes sense. Nice explanation. Let me review the PR.

gnarroway avatar Dec 19 '20 00:12 gnarroway

Would be nice to have this feature, but maybe not an entirely related problem and could be solved with something like https://github.com/resilience4clj/resilience4clj-retry more generically?

AndreaCrotti avatar Aug 02 '21 10:08 AndreaCrotti

@AndreaCrotti , I think folks can build a custom solution with resilience4j easily enough but it's difficult to do in a generic way as :async? and :throw-exceptions change the logic just enough to be annoying. I think resilience4clj-retry would work well if you always do sync requests and throw-exceptions. Additionally, nearly all the most popular HTTP clients have a way to specify a retry handler/interceptor of some kind and the solution posted in my MR will likely handle 95% of use cases. If folks need something more powerful, using something resilience4clj-retry would make sense.

vincentjames501 avatar Aug 02 '21 21:08 vincentjames501