paws icon indicating copy to clipboard operation
paws copied to clipboard

Timeout and retry missing

Open aleaficionado opened this issue 4 years ago • 6 comments

API calls to AWS still timeout after 10 seconds even though the default timeout was changed to null in 0.3.8. I couldn't find a way to change the timeout.

Additionally, it seems that the retry functionality isn't implemented even though the code uses the HTTR lib which provides it.

I've attached a patch with the changes to paws.common that "fix" the above two issues via the config parameter (added timeout) and implemented in the required functions, but just to point out I have no idea how PAWS or even R are supposed to work.

network params.txt This is a diff but gh won't let me upload it as such.

Thanks, Adi

aleaficionado avatar Apr 29 '21 15:04 aleaficionado

Thank you!

Sorry about the issues. This is useful, and in fact we will also make these configurable, like paws::svc(config = list(max_retries = 5)).

The issue with timeouts is strange since as you stated the default was updated in the last release and should be no timeout. Would you be willing to try installing the latest paws.common from CRAN with install.packages("paws.common")?

In general paws.common gets installed automatically but if you have an older version installed it won't update automatically even if you update the paws package. I think what we'll do is on the next release we'll mandate installing >=0.3.8.

I think we'll want to keep the default as no timeout because some operations download lots of data or otherwise take a long time, which is what prompted the change in 0.3.8.

davidkretch avatar May 01 '21 13:05 davidkretch

Hi,

First of al great package. I don't think I had an older version of paws.common as I have never used R before submitting this issue, but I will check once I open the work laptop and confirm (we also had this for a couple of devs). To check for timeouts you can try issuing a dummy request to http://example.com:81/ and check the exact timeout you get though you might need to add a dummy service or the like. My feeling is that some library will set a default timeout of 10 seconds so unless you specify it, it will default to that.

The timeout should not relate to the length of the operation - it's the time between your request being sent and the acknowledgment from the server (and the other way around). So if your timeout is 1 second and you are calling an API that takes 1 minute to complete, the operation will be successful (baring network issues) as long as the server acknowledges your request within one second. You should be able to verify this by setting a timeout of 1 second and downloading or uploading a large file.

Generally you would want a defined timeout and retry as some HTTP requests do get lost (especially with corporate proxies) and you would hang the client with an infinite timeout. E.g. boto3 (or botocore) from AWS uses a timeout of 60 seconds and three retry attempts.

Cheers, A

aleaficionado avatar May 01 '21 18:05 aleaficionado

Ohhh, ok. Thank you! No need to check the package version. So after looking into it I realized that the timeout that the httr package uses is a max request duration, and there is a separate option for the timeout waiting for a connection, the latter of which does default to 10 seconds. I agree about the connection timeout and I'll make that update tomorrow.

davidkretch avatar May 02 '21 02:05 davidkretch

Hey, sorry for the delay. The connection timeout setting is now available on CRAN in the paws.common package. You can get the latest version, 0.3.10, with install.packages("paws.common"). However, if you are installing everything from scratch you won't have to explicitly install paws.common.

The default timeout is 60 seconds, and you can change it with the following syntax:

svc <- paws::svc(config = list(timeout = 10))

The retry logic is not yet implemented. Since not all errors are retryable, and some errors are only retryable based on the API level error code, it will take a little longer to implement that part.

davidkretch avatar May 15 '21 15:05 davidkretch

Hi,

Sorry about taking time to reply to this. I got dragged into something else. Isn't max attempts the same as retry logic?

aleaficionado avatar Aug 05 '21 15:08 aleaficionado

It just clicked. So the max_attempts option was not implemented. I agree there are some situations where you don't want to retry a request and there are cons to implementing the retry logic across the board, but there are also downsides to not having it available.

Timeout handling can be implemented in individual API calls outside of the lib, but this would make the code extremely verbose.

If it was implemented with max_attempts = 1 as default, then this would keep the current behaviour, but it would also allow users who want to implement it across the board (given the pros and cons) to do so.

aleaficionado avatar Aug 05 '21 16:08 aleaficionado