req icon indicating copy to clipboard operation
req copied to clipboard

Would it make sense to make retry `transient?` function public?

Open jozuas opened this issue 1 year ago • 4 comments

I'm thinking that it would be nice to have the ability to define your own retry function, and in the retry function if the conditions that I care about don't match, default to https://github.com/wojtekmach/req/blob/85bd7437488eeaf013659d1bebdce0511333d4ae/lib/req/steps.ex#L2292

I'll be happy to make a PR if this sounds reasonable.

jozuas avatar Jun 11 '24 16:06 jozuas

This is very arbitrary but I'd like to keep Req.Steps to just have steps functions as it's already pretty big module. The downside is we can't have this function which I agree would be very convenient to have. Perhaps there is another way to solve this?

wojtekmach avatar Jun 11 '24 18:06 wojtekmach

I had a look at Req.Steps to see if there's a more general pattern that we could take advantage of, but sadly it seems that only the transient? function would benefit from being exposed. Creating a whole new module e.g. Req.Conveniences would not look so great if there's only one function there of interest.

The retry function implementation can return a tuple {:delay, milliseconds}, maybe we could support new options {:retry, :transient} and {:retry, :safe_transient}?

    Req.new(
      ...,
      retry: fn
        _req, %Req.Response{status: 403} ->
          # Hypothetical service that returns HTTP 403 instead of 429 for "Too Many Requests" 
          {:delay, 30_000}

        _req, _resp ->
          {:retry, :safe_transient}
      end
    )

jozuas avatar Jun 11 '24 19:06 jozuas

@wojtekmach how does the proposal sound? Is this something I could start working on that the Req library would be happy to accept?

jozuas avatar Jun 20 '24 16:06 jozuas

Sorry I'm still considering it.

wojtekmach avatar Jun 23 '24 07:06 wojtekmach

Public transient? can also be useful in testing. When we want to make retry delays short in tests, we need to provide custom retry function. But if we also want to use transient retry logic (which is pretty much always), we now need to re-implement Req internals in our own code (e.g. checking status codes and error types), while we could just use public transient?

andreyvolokitin avatar Apr 09 '25 14:04 andreyvolokitin

@andreyvolokitin can you share how you are using retry? You can set just the :retry_delay option to 0 in tests.

wojtekmach avatar Apr 10 '25 08:04 wojtekmach

I control max_retries and initial retry delay because I wanted to test backoff functionality (e.g. check if retry delay increases as expected) in integration test, and I thought that when using :retry_delay as a function we're loosing built-in retry logic (transient and Retry-After header checks). But it appears that :retry_delay is just calculating actual delay and all other logic stays in place. So retry option is used if we want to control retry logic completely, and :retry_delay is used to just control delay (which thinking about it is obvious).

So if not testing backoff intervals I indeed could use :retry_delay: 0, and if wanting more control I could use :retry_delay: backof_fn and control my initial retry delay there. And currently I use retry to essentially re-implement everything:

  defp req(my_api_options) do
    my_api_options =
      Keyword.validate!(options,
        retry: false
      )

    if(my_api_options[:retry], do: req_retry_options(), else: [])
    |> Keyword.merge(base_url: "https://api.example.com")
    |> Req.new()
  end

  defp req_retry_options do
    [
      retry: fn request, response_or_exception ->
        if ReqUtil.transient?(response_or_exception) do
          get_retry_delay(request, response_or_exception)
        else
          nil
        end
      end,
      max_retries: Application.fetch_env!(:app, :api_max_retries)
    ]
  end

  defp get_retry_delay(request, %Req.Response{status: status} = response)
       when status in [429, 503] do
    retry_after = Req.Response.get_retry_after(response)

    if retry_after do
      {:delay, retry_after}
    else
      req_exponential_delay(request)
    end
  end

  defp get_retry_delay(request, _response) do
    req_exponential_delay(request)
  end

  defp req_exponential_delay(request) do
    retry_count = Req.Request.get_private(request, :req_retry_count, 0)
    {:delay, Util.retry_backoff_exponential(retry_count, retry_backoff_start_ms())}
  end

  defp retry_backoff_start_ms do
    Application.fetch_env!(:app, :api_retry_start_delay_ms)
  end

andreyvolokitin avatar Apr 10 '25 09:04 andreyvolokitin

@andreyvolokitin would returning {:retry, :safe_transient} like mentioned in https://github.com/wojtekmach/req/issues/371#issuecomment-2161436668 work for you?

wojtekmach avatar Apr 10 '25 09:04 wojtekmach

I can control retry delay and test backoff logic with these Req options, while still having Req to handle transient and Retry-After logic for me behind the scenes, so I no longer need any suggestions from this issue:

    [
      retry: :transient,
      retry_delay: &Util.retry_backoff_exponential(&1, Application.fetch_env!(:app, :api_retry_start_delay_ms)),
      max_retries: Application.fetch_env!(:app, :api_max_retries)
    ]

Regarding {:retry, :safe_transient}, for this (now irrelevant) use-case I needed to distinguish transient responses, hence the desire to make transient? public, but I'm not completely sure what {:retry, :safe_transient} would actually do. Though it looks like more abstract and less direct way to handle transient cases. My guess is that for completely custom retry logic (which is what retry option appears to do), public transient? is the most direct and flexible tool

andreyvolokitin avatar Apr 10 '25 16:04 andreyvolokitin